Principal component regression vs. partial linear squares regression in prediction modelling

Jurina, Tamara and Valinger, Davor and Gajdos Kljusuric, Jasenka and Benkovic, Maja and Jurinjak Tusek, Ana and Kurtanjek, Zelimir and Antoska-Knights, Vesna (2019) Principal component regression vs. partial linear squares regression in prediction modelling. In: 24th International Scientific Symposium on Biometrics.

Full text not available from this repository.


Principal component regression (PCR) and partial least squares regression (PLSR) are
among the most widely used multivariate analysis tools in chemometrics. The challenge is to
assess whether one is superior to the other. An herbal extract of melissa was used as the
example in this study. Spectra of the melissa extract samples were recorded (Ultraviolet–visible
spectroscopy, UV-VIS, and near infrared spectroscopy, NIR) and the content of total phenols (TP) was
determined.
The UV and NIR absorbance spectra of the aqueous extracts were collected at three
temperatures (T = 40, 60 and 80 °C) over a time interval from 0.5 to 90 min and were used
to build PCR and PLS models. Models were tested for the UV spectral range, for the NIR
spectral range and for the combined UV+NIR spectral range, and model refinement and
validation were performed by cross-validation. Model efficiency was assessed using R-squared (R2),
root mean squared error of prediction (RMSEP), adjusted R2, the Ratio of standard error of
Performance to standard Deviation (RPD) and the Range Error Ratio (RER). R2
describes how well the experimental data fit the statistical model. RMSEP is used as a
measure of the average accuracy of the prediction. The accuracy of the models is also
compared on the basis of adjusted R2, in order to account for the number of model
parameters relative to the available spectra. R2, RER and RPD are dimensionless, meaning
that they can be compared on the same basis between models for different
constituents/properties, allowing model efficiency assessment. Higher RPD and RER values
suggest more accurate models. Values of RPD and RER less than 3 and 10, respectively,
indicate models suitable only for qualitative use, while models with higher values are
considered suitable for quantitative prediction.
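The efficiency parameters named above can be illustrated with a minimal sketch. It assumes the commonly used definitions (RPD as the standard deviation of the reference values divided by RMSEP, RER as the range of the reference values divided by RMSEP, and the usual adjusted R2 formula); the abstract does not state which exact formulas the authors applied. The function name `prediction_metrics`, the parameter `n_params` and the numbers in the usage lines are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def prediction_metrics(y_true, y_pred, n_params):
    """Return R2, adjusted R2, RMSEP, RPD and RER for one model.

    Assumed definitions (not confirmed by the abstract):
      RMSEP  = sqrt(sum of squared residuals / n)
      adj R2 = 1 - (1 - R2) * (n - 1) / (n - n_params - 1)
      RPD    = sample standard deviation of y_true / RMSEP
      RER    = (max(y_true) - min(y_true)) / RMSEP
    """
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)

    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    rmsep = math.sqrt(ss_res / n)
    return {
        "r2": r2,
        "adj_r2": adj_r2,
        "rmsep": rmsep,
        "rpd": stdev(y_true) / rmsep,
        "rer": (max(y_true) - min(y_true)) / rmsep,
    }


# Hypothetical reference and predicted values, for illustration only.
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 31.0, 39.0, 52.0]
metrics = prediction_metrics(reference, predicted, n_params=1)

# Thresholds quoted in the abstract: RPD >= 3 and RER >= 10 for quantitative use.
quantitative = metrics["rpd"] >= 3 and metrics["rer"] >= 10
print(metrics, quantitative)
```

Because RPD and RER divide by the same error term, they rank models fitted to the same reference values identically; they differ only in how the spread of those values is summarised.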
When the selected wavelength regions of UV-VIS and NIR were used separately, PLS
produced slightly better results (UV-VIS: R2 = 0.973, RPD = 6.123, RER = 22.236)
with RMSE = 4.800. For the combined spectral range of UV-VIS and NIR (325-1699 nm) the
PCR model produced better results (R2 = 0.999, RPD = 3.138, RER = 13.200, with
RMSE = 11.877). Judging the superiority of one model over the other is not an easy
task, because the dimensionless parameters and the error (RMSE) did not show exactly the
same trend: R2 was higher for the model with the higher RMSE. The major difference
between PLSR and PCR was that PCR required a higher number of factors, which is not
a significant problem.
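The kind of comparison described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it uses synthetic "spectra" generated from three latent factors (not the melissa data), a textbook NIPALS PLS1 algorithm, PCR via singular value decomposition, and 5-fold cross-validation. All function names (`pcr_fit`, `pls1_fit`, `predict`, `rmse_cv`) and all numeric settings are assumptions made for the example.

```python
import numpy as np


def pcr_fit(X, y, n_comp):
    """PCR: regress y on the scores of the first n_comp principal components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_comp].T                                   # loadings, shape (p, n_comp)
    gamma = np.linalg.lstsq(Xc @ V, y - y_mean, rcond=None)[0]
    return V @ gamma, x_mean, y_mean                    # coefficients in X space


def pls1_fit(X, y, n_comp):
    """PLS1 (single response) via NIPALS with deflation of X and y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xk.T @ yk                                   # weight: covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                                      # scores
        tt = t @ t
        p = Xk.T @ t / tt                               # X loadings
        qk = (yk @ t) / tt                              # y loading
        Xk = Xk - np.outer(t, p)                        # deflate X
        yk = yk - qk * t                                # deflate y
        W.append(w)
        P.append(p)
        q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean


def predict(model, X):
    beta, x_mean, y_mean = model
    return (X - x_mean) @ beta + y_mean


def rmse_cv(fit, X, y, n_comp, n_folds=5, seed=0):
    """Root mean squared error from k-fold cross-validated predictions."""
    idx = np.random.default_rng(seed).permutation(len(y))
    y_hat = np.empty(len(y))
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        y_hat[test] = predict(fit(X[train], y[train], n_comp), X[test])
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


# Synthetic data (assumption): 60 samples, 50 "wavelengths", 3 latent factors.
rng = np.random.default_rng(0)
scores = rng.normal(size=(60, 3))
loadings = rng.normal(size=(3, 50))
X = scores @ loadings + 0.05 * rng.normal(size=(60, 50))
y = scores @ np.array([1.0, -0.5, 0.25]) + 0.05 * rng.normal(size=60)

for k in (1, 2, 3, 4):
    print(k, rmse_cv(pcr_fit, X, y, k), rmse_cv(pls1_fit, X, y, k))
```

Printing the cross-validated error for several numbers of factors shows how many components each method needs before the error levels off. PLS chooses its components using the response, so it often reaches a given error with fewer factors than PCR, which is in line with the difference in the number of factors noted in the abstract.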

Item Type: Conference or Workshop Item (Paper)
Subjects: Scientific Fields (Frascati) > Natural sciences > Mathematics
Divisions: Faculty of Technology and Technical Sciences
Depositing User: Mr Jordan Martinovski
Date Deposited: 21 Mar 2023 11:39
Last Modified: 21 Mar 2023 11:39
