Because it is fast and noninvasive, near-infrared spectroscopy (NIRS) is widely used to measure the nutritional content of nut foods. This study addresses a known weakness of offline quantitative analysis models: they perform poorly on new batches of Korean pine nuts because of complex, unquantifiable factors such as storage conditions and differences in origin. Building on the offline model, an online learning model is proposed that combines recursive partial least squares (RPLS) regression with online multiplicative scatter correction (OMSC) preprocessing. This approach updates the original detection model online using a small amount of sample data, thereby improving its generalization ability. The OMSC algorithm reduces the prediction error that arises when scatter correction cannot be performed effectively on the updated dataset. During model updating, the uninformative variable elimination (UVE) algorithm moderately increases the number of selected feature bands to widen the range of potentially relevant features. The final model is obtained iteratively by combining the feature data of new samples with RPLS. The results show that, after OMSC preprocessing and with the number of features increased to 100, the online model achieves an R² of 0.8945 and a root mean square error of prediction (RMSEP) of 3.5964 on the prediction set, substantially better than the offline model's 0.4525 and 24.6543, respectively. This indicates that the online model is dynamic and sustainable and closely matches practical detection conditions. It provides technical references and methods for the design and development of detection systems, and it offers nut food regulatory agencies and production enterprises an environmentally friendly tool for rapid on-site analysis.
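The scatter-correction step can be illustrated with a minimal Python sketch. The abstract does not give the details of OMSC, so the code below shows only standard multiplicative scatter correction with one assumption added: the reference spectrum is computed once from the offline calibration set and then reused for spectra that arrive later. The function names `fit_msc_reference` and `apply_msc` are hypothetical and are not taken from the study.

```python
import numpy as np


def fit_msc_reference(X_cal):
    """Return the mean spectrum of the calibration set (rows = samples).

    Assumption: this mean is stored and reused as the MSC reference
    for every later batch; the paper's OMSC may differ.
    """
    return np.asarray(X_cal, dtype=float).mean(axis=0)


def apply_msc(X, reference):
    """Apply standard MSC to each spectrum using a fixed reference.

    Each spectrum x is modelled as x ~ a + b * reference by least squares,
    and the corrected spectrum is (x - a) / b.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    corrected = np.empty_like(X)
    for i, x in enumerate(X):
        # np.polyfit returns coefficients from highest degree: slope, then intercept
        b, a = np.polyfit(reference, x, deg=1)
        corrected[i] = (x - a) / b
    return corrected


# Synthetic illustration only; these are not data from the study.
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0.0, 3.0, 50)) + 1.0
X_cal = np.array(
    [rng.uniform(-0.2, 0.2) + rng.uniform(0.8, 1.2) * base for _ in range(20)]
)
reference = fit_msc_reference(X_cal)

# A "new batch" spectrum with a different offset and scale than the calibration set
new_spectrum = 0.3 + 1.7 * reference
new_corrected = apply_msc(new_spectrum, reference)
```

Keeping the reference fixed means a small update batch is corrected onto the same scale as the original calibration data, rather than toward its own (possibly unrepresentative) mean.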