Abstract

Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized to quantify samarium (0–150 μg mL⁻¹), europium (0–75 μg mL⁻¹), and lithium chloride (0.1–12 M) with a transformational preprocessing strategy. Selecting combinations of preprocessing methods to optimize the prediction performance of regression models is frequently a major bottleneck in chemometric analysis. Here, we propose an optimization tool that combines optimal experimental designs for selecting preprocessing transformations with a genetic algorithm (GA) for feature selection. Neither a D-optimal design containing 26 samples (i.e., combinations of preprocessing strategies) nor a user-defined design (576 samples) lowered the root mean square error of prediction (RMSEP) to a statistically significant degree. The greatest improvement in prediction performance was achieved when a GA was used for feature selection. Feature selection lowered the RMSEP by an average of 53%, yielding top models with percent RMSEP values of 0.91%, 3.5%, and 2.1% for Sm(III), Eu(III), and LiCl, respectively. These results indicate that preprocessing corrections (e.g., scatter, scaling, noise, and baseline) alone cannot realize the optimal regression model; feature selection is the more crucial aspect to consider. This approach provides a powerful tool for approaching the true optimum prediction performance and can be applied to numerous fields of spectroscopy and chemometrics to construct models rapidly.
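The GA-based feature selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`rmsep`, `make_fitness`, `ga_select`, `toy_data`), the GA settings, and the synthetic data are all hypothetical, and a one-coefficient least-squares fit on the summed selected channels stands in for the partial least squares model so that the example needs only the Python standard library. Each candidate is a Boolean mask over spectral channels, scored by its RMSEP on a held-out set.

```python
import math
import random

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def make_fitness(X_cal, y_cal, X_val, y_val):
    """Return a function mapping a channel mask to a validation RMSEP.

    Assumption: a toy model y ~ b * (sum of selected channels) replaces PLS.
    """
    def summed(row, mask):
        return sum(x for x, keep in zip(row, mask) if keep)

    def fitness(mask):
        if not any(mask):
            return float("inf")  # an empty selection cannot predict anything
        s_cal = [summed(row, mask) for row in X_cal]
        denom = sum(s * s for s in s_cal)
        if denom == 0:
            return float("inf")
        b = sum(s * y for s, y in zip(s_cal, y_cal)) / denom  # least-squares slope
        return rmsep(y_val, [b * summed(row, mask) for row in X_val])

    return fitness

def ga_select(fitness, n_features, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Search for a low-error channel mask with a simple elitist GA."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [True] * n_features  # include the full spectrum as a baseline candidate
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # flip each gene with probability p_mut
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=fitness)
    return best, fitness(best)

def toy_data(n, rng):
    """Synthetic spectra: channels 0-2 track concentration, channels 3-7 are noise."""
    X, y = [], []
    for _ in range(n):
        c = rng.uniform(0, 150)
        row = [0.01 * c + rng.gauss(0, 0.01) for _ in range(3)]
        row += [rng.uniform(0, 5) for _ in range(5)]
        X.append(row)
        y.append(c)
    return X, y

data_rng = random.Random(1)
X_cal, y_cal = toy_data(40, data_rng)
X_val, y_val = toy_data(20, data_rng)
fitness = make_fitness(X_cal, y_cal, X_val, y_val)

full_error = fitness([True] * 8)  # RMSEP using every channel
best_mask, best_error = ga_select(fitness, n_features=8)
print("full-spectrum RMSEP:", full_error)
print("GA-selected RMSEP:  ", best_error, "mask:", best_mask)
```

Because the full-spectrum mask is seeded into the initial population and the better half of each generation is carried over unchanged, the selected mask in this sketch can never score worse than using every channel. In practice the fitness function would wrap a cross-validated PLS model rather than the toy fit used here.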
