This study proposed a novel approach that automatically selects the preprocessing methods and the hyperparameters of machine learning (ML) algorithms for near-infrared (NIR) spectroscopy data according to their cross-validation performance. The method searches over single and multiple preprocessing steps while tuning the hyperparameters at the same time, and it was applied to FT-NIR and Micro-NIR spectra of coconut milk adulterated with distilled water and mature coconut water in the range of 0%–50%. Computational experiments were conducted using nine single preprocessing types, three ML classifiers (linear discriminant analysis (LDA), k-nearest neighbour (KNN) and multilayer perceptron (MLP)) and three ML regressors (partial least squares (PLS), KNN and MLP). The proposed strategy produced satisfactory results for both the classification and the regression tasks in coconut milk adulteration. Overall, the results demonstrated that the proposed approach can more accurately determine the best model, particularly for NIR spectroscopy of coconut milk adulteration.
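The selection strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, synthetic spectra produced by a hypothetical `make_spectra` helper, and two preprocessing steps chosen for illustration (standard normal variate and a first difference), since the abstract does not name the nine preprocessing types. It scores every combination of preprocessing chain and KNN neighbour count by 5-fold cross-validation accuracy and keeps the best.

```python
import random
from itertools import permutations, product
from statistics import mean, pstdev

def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum."""
    m, s = mean(spectrum), pstdev(spectrum)
    return [(v - m) / s for v in spectrum]

def first_derivative(spectrum):
    """First difference between neighbouring wavelengths."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

# Illustrative preprocessing steps (assumed, not taken from the paper).
PREPROCESSORS = {"snv": snv, "derivative": first_derivative}

def apply_steps(steps, spectrum):
    """Apply a chain of named preprocessing steps in order."""
    for name in steps:
        spectrum = PREPROCESSORS[name](spectrum)
    return list(spectrum)

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training spectra (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def cv_accuracy(X, y, steps, k, n_folds=5):
    """Cross-validated accuracy of one (preprocessing chain, k) candidate."""
    # Both steps act on each spectrum independently, so preprocessing
    # before splitting does not leak information between folds.
    Xp = [apply_steps(steps, row) for row in X]
    correct = 0
    for fold in range(n_folds):
        test_idx = set(range(fold, len(Xp), n_folds))
        train = [i for i in range(len(Xp)) if i not in test_idx]
        train_X = [Xp[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            knn_predict(train_X, train_y, Xp[i], k) == y[i] for i in test_idx
        )
    return correct / len(Xp)

def select_best(X, y, ks=(1, 3, 5)):
    """Search jointly over preprocessing chains and the KNN hyperparameter k."""
    names = list(PREPROCESSORS)
    # No preprocessing, every single step, and every ordered pair of steps.
    pipelines = [()] + [p for r in (1, 2) for p in permutations(names, r)]
    scores = {
        (steps, k): cv_accuracy(X, y, steps, k)
        for steps, k in product(pipelines, ks)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]

def make_spectra(n=60, n_points=30, seed=0):
    """Synthetic two-class spectra with random offset and scatter (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        offset, scale = rng.uniform(-1, 1), rng.uniform(0.5, 2.0)
        row = []
        for j in range(n_points):
            peak = 1.0 if 10 <= j < 15 else 0.0
            signal = j / n_points + (0.5 + 0.3 * label) * peak
            row.append(offset + scale * signal + rng.gauss(0, 0.01))
        X.append(row)
        y.append(label)
    return X, y

(best_steps, best_k), best_acc = select_best(*make_spectra())
print(best_steps, best_k, round(best_acc, 3))
```

The same loop extends to other classifiers or to regressors by swapping `knn_predict` and the accuracy score for the relevant model and metric. A step that learns parameters from the data would have to be fitted inside each training fold rather than applied before splitting.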