Abstract

The objective of this study was to compare two machine learning approaches for quantifying total polyphenols, with the optimal spectral intervals chosen by the synergy interval partial least squares (Si-PLS) model. To improve the robustness of the built models, the genetic algorithm (GA) and competitive adaptive reweighted sampling (CARS) were applied to the selected subset of variables. The collected spectral data were divided into 19 sub-intervals, and the interval selection totaling 246 variables yielded the lowest root mean square error of cross-validation (RMSECV). Model performance was evaluated using the correlation coefficients for calibration (RC) and prediction (RP), RMSECV, root mean square error of prediction (RMSEP) and residual predictive deviation (RPD). The Si-GA-PLS model produced the following results: PCs = 9; RC = 0.915; RMSECV = 1.39; RP = 0.8878; RMSEP = 1.62; and RPD = 2.32. The Si-CARS-PLS model performed best at PCs = 10, with RC = 0.9723, RMSECV = 0.81, RP = 0.9114, RMSEP = 1.45 and RPD = 2.59. The predictive ability of the built models improved in the order PLS < Si-PLS < CARS-PLS when full spectroscopic data were used and Si-PLS < Si-GA-PLS < Si-CARS-PLS when interval selection was performed with the Si-PLS model. Finally, the developed method was successfully used to quantify total polyphenols in tea. © 2023 Society of Chemical Industry.
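The prediction-set metrics named above (RP, RMSEP and RPD) can be illustrated with a minimal sketch. It assumes the common chemometric definitions: RMSEP as the root mean squared difference between reference and predicted values, RP as the Pearson correlation between them, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function name `prediction_metrics` and the sample values are hypothetical and are not taken from the study, which may define these quantities slightly differently.

```python
import numpy as np


def prediction_metrics(y_true, y_pred):
    """Return R, RMSEP and RPD for a prediction set (assumed definitions)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Root mean square error of prediction.
    rmsep = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

    # Pearson correlation coefficient between reference and predicted values.
    r = float(np.corrcoef(y_true, y_pred)[0, 1])

    # Residual predictive deviation: spread of the reference values relative
    # to the prediction error (sample standard deviation, ddof=1, is an
    # assumption here).
    rpd = float(np.std(y_true, ddof=1) / rmsep)

    return {"R": r, "RMSEP": rmsep, "RPD": rpd}


# Hypothetical reference and predicted values, for illustration only.
y_ref = [10.0, 12.0, 14.0, 16.0, 18.0]
y_hat = [11.0, 11.0, 15.0, 15.0, 19.0]
metrics = prediction_metrics(y_ref, y_hat)
print(metrics)
```

Under these definitions, a lower RMSEP for the same set of reference values gives a higher RPD, which matches the direction of the reported figures (RMSEP 1.62 with RPD 2.32 versus RMSEP 1.45 with RPD 2.59).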
