Peanut oil is a popular edible vegetable oil worldwide. Its raw material may be contaminated by heavy metals during planting, processing, and transportation. In this study, the content of the heavy metal cadmium (Cd) in peanut oil was rapidly determined by Fourier transform near-infrared spectroscopy (FT-NIRS) combined with a machine learning model. Three variable optimization algorithms were applied to the experimentally acquired FT-NIR spectra: variable combination population analysis (VCPA), iteratively variable subset optimization (IVSO), and a multi-feature spaces ensemble strategy based on the least absolute shrinkage and selection operator (MFE-LASSO). A partial least squares (PLS) regression model was then used to determine the Cd content in peanut oil. The results show that all three variable selection algorithms are effective in identifying key combinations of variables. Among them, the MFE-LASSO-PLS model showed strong robustness and generalization performance, with a root mean square error of prediction (RMSEP) of 3.7753 mg·kg⁻¹, a prediction coefficient of determination (Rₚ²) of 0.9675, and a relative prediction deviation (RPD) of 5.5485. These findings demonstrate that rapid and accurate determination of Cd in peanut oil can be achieved by integrating FT-NIRS with a machine learning model.
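
The three figures of merit reported above can be illustrated with a minimal sketch. The abstract does not state the formulas used, so the definitions below are assumptions based on common chemometric practice: RMSEP as the root of the mean squared prediction error, the coefficient of determination as 1 − SSE/SST, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function names and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def rmsep(y_true, y_pred):
    """Root mean square error of prediction on a held-out prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_prediction(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SSE / SST."""
    y_bar = mean(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - sse / sst


def rpd(y_true, y_pred):
    """Assumed definition: sample SD of the reference values over RMSEP."""
    return stdev(y_true) / rmsep(y_true, y_pred)


# Hypothetical reference and predicted Cd contents (mg/kg), illustration only.
y_ref = [10.0, 20.0, 30.0, 40.0, 50.0]
y_hat = [12.0, 18.0, 31.0, 39.0, 52.0]

print("RMSEP:", rmsep(y_ref, y_hat))
print("R2p:  ", r2_prediction(y_ref, y_hat))
print("RPD:  ", rpd(y_ref, y_hat))
```

Under these assumed definitions, the coefficient of determination equals 1 − n/((n − 1)·RPD²), which approaches 1 − 1/RPD² for large prediction sets. The reported values are consistent with that relation: 1 − 1/5.5485² ≈ 0.9675.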