Abstract

This study addresses two challenges in the spectroscopic analysis (1100–2498 nm) of corn quality (moisture, fat, protein, and starch): selecting relevant variables and providing interpretable insights. It does so by combining variable selection techniques with explainable artificial intelligence (AI). Three variable selection algorithms were used to select 36 important variables, which combinatorial optimization then reduced to 11 common wavelengths. Partial least squares regression (PLSR) models were developed using the full spectral range, the individual feature variables, and the common variables. Based on the root mean square error of prediction (RMSEP), the PLSR models built on the common variables outperformed those built on the individual feature variables for starch (0.22 % vs. 0.24 %) and were comparable for fat (oil; 0.05 % vs. 0.05 %), moisture (water; 0.02 % vs. 0.01 %), and protein (0.12 % vs. 0.09 %). The SHapley Additive exPlanations (SHAP) method was employed to explain the PLSR models and to assess the contribution of each common variable. This combined approach improved the accuracy and interpretability of corn prediction models, building trust in the results and clarifying the relationship between spectral features and corn quality attributes.
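
The workflow summarized above (fit PLSR on the full spectrum and on a reduced wavelength set, compare RMSEP, then attribute predictions to wavelengths) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic, the 2 nm grid, the 11 indices in `common_idx`, the four components, and the 60/20 split are hypothetical, and the helper names (`pls1_fit`, `predict`, `rmsep`) are invented here. The attribution step uses the closed-form SHAP values of a linear model with features treated as independent, which may differ from the SHAP estimator used in the study.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS regression via NIPALS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                    # weight vector: covariance direction
        w /= np.linalg.norm(w)
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        c = (yk @ t) / tt                # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in X space
    return coef, x_mean, y_mean

def predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in for NIR spectra (assumed 2 nm grid over 1100-2498 nm).
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 80, 700
wavelengths = np.linspace(1100, 2498, n_wavelengths)
scores = rng.normal(size=(n_samples, 4))
loadings = rng.normal(size=(4, n_wavelengths))
X = scores @ loadings + 0.05 * rng.normal(size=(n_samples, n_wavelengths))

# Hypothetical "common" wavelengths; in the study these come from variable selection.
common_idx = np.linspace(0, n_wavelengths - 1, 11, dtype=int)
y = X[:, common_idx] @ rng.normal(size=11) + 0.05 * rng.normal(size=n_samples)

train, test = np.arange(60), np.arange(60, 80)

# Model on the full spectrum.
coef_f, xm_f, ym_f = pls1_fit(X[train], y[train], 4)
rmsep_full = rmsep(y[test], predict(X[test], coef_f, xm_f, ym_f))

# Model on the 11 common wavelengths only.
Xc = X[:, common_idx]
coef_c, xm_c, ym_c = pls1_fit(Xc[train], y[train], 4)
pred_common = predict(Xc[test], coef_c, xm_c, ym_c)
rmsep_common = rmsep(y[test], pred_common)

# Linear-model SHAP values (independent-feature assumption):
# phi_ij = coef_j * (x_ij - training mean of feature j), base value = ym_c.
phi = (Xc[test] - xm_c) * coef_c
mean_abs_phi = np.abs(phi).mean(axis=0)

print(f"RMSEP full spectrum: {rmsep_full:.4f}")
print(f"RMSEP common set:    {rmsep_common:.4f}")
for wl, imp in zip(wavelengths[common_idx], mean_abs_phi):
    print(f"{wl:7.1f} nm  mean |SHAP| = {imp:.4f}")
```

Because a fitted PLSR model is linear in the input variables, each row of `phi` sums with the base value `ym_c` to the model's prediction for that sample, which is the additivity property that makes SHAP attributions easy to audit. Neighbouring NIR wavelengths are usually strongly correlated, so the independence assumption behind this closed form is a simplification.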
