Abstract

Vibrational spectroscopy has become a valuable tool in many fields because it provides a molecular signature through a non-destructive measurement. The identification and prediction performance of the technique depends greatly on the pre-processing steps used to remove unwanted sources of variability, especially for biological matter. However, finding the right combination of pre-processing methods (smoothing, baseline correction and/or normalization) is not a trivial task, and the choice usually depends on the operator's habits. As testing all possible pre-processing sequences is time-consuming, genetic algorithms (GAs) have been put forward as a way to quickly find a relatively good sequence. We present here a GA that additionally optimizes the regression model, which automates the whole data analysis process and paves the way to automated machine learning. To make the best of GAs, we determined the optimal GA parameters based on three datasets from different vibrational spectroscopy modalities (Raman or IR spectra of food industry or biological samples). These parameters depended on the desired quality of the solution but hardly on the dataset itself, meaning they could be used on new data without further tuning. Our method compares favorably with random search, ant colony optimization and the tree-structured Parzen estimator, which are commonly used in machine learning for tuning hyperparameters. In conclusion, we provide a GA adapted to the simultaneous selection of pre-processing and regression methods for vibrational spectra.
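
To make the idea concrete, the following is a minimal sketch of a GA that searches jointly over pre-processing steps and a regression model. It is not the authors' implementation: the option names in `SEARCH_SPACE`, the functions `run_ga` and `toy_fitness`, and all parameter values are assumptions chosen for illustration. In a real analysis, the fitness would presumably be a cross-validated prediction error computed on the spectra; here a toy stand-in is used so that the example runs with the Python standard library alone.

```python
import random

# Hypothetical search space: one gene per pipeline step (names are illustrative).
SEARCH_SPACE = {
    "smoothing": ["none", "savitzky_golay", "moving_average"],
    "baseline": ["none", "polynomial", "asymmetric_least_squares"],
    "normalization": ["none", "vector", "snv"],
    "model": ["pls", "ridge", "svr"],
}
GENES = list(SEARCH_SPACE)


def random_individual(rng):
    """Draw one option per gene to form a candidate pipeline."""
    return tuple(rng.choice(SEARCH_SPACE[g]) for g in GENES)


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(ind, rate, rng):
    """Resample each gene with probability `rate`."""
    return tuple(
        rng.choice(SEARCH_SPACE[g]) if rng.random() < rate else v
        for g, v in zip(GENES, ind)
    )


def tournament(pop, scores, k, rng):
    """Return the lowest-scoring individual among k random contenders."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[min(contenders, key=lambda i: scores[i])]


def run_ga(fitness, pop_size=12, generations=15, mutation_rate=0.2,
           tournament_size=3, seed=0):
    """Minimize `fitness` over pipelines; returns (best, score, history)."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best score seen in each generation
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best_idx = min(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best_idx])
        children = [pop[best_idx]]  # elitism: the best individual survives
        while len(children) < pop_size:
            p1 = tournament(pop, scores, tournament_size, rng)
            p2 = tournament(pop, scores, tournament_size, rng)
            children.append(mutate(crossover(p1, p2, rng), mutation_rate, rng))
        pop = children
    scores = [fitness(ind) for ind in pop]
    best_idx = min(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best_idx])
    return pop[best_idx], scores[best_idx], history


def toy_fitness(ind):
    """Toy stand-in for a cross-validated error (assumption, not real data):
    counts the genes that differ from an arbitrary reference pipeline."""
    reference = ("savitzky_golay", "polynomial", "snv", "pls")
    return sum(a != b for a, b in zip(ind, reference))


best, score, history = run_ga(toy_fitness)
print(dict(zip(GENES, best)), score)
```

Because the best individual is carried over unchanged at each generation (elitism), the best score recorded in `history` can never get worse. The population size, number of generations and mutation rate are the kind of GA parameters the abstract refers to; the defaults above are arbitrary placeholders, not the tuned values reported by the authors.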

