Abstract

Near infrared (NIR) spectroscopy is an efficient, low-cost analytical technique widely applied to identify the origin of food and pharmaceutical products. NIR spectra-based classification strategies typically use thousands of equally spaced wavelengths as input, some of which may carry no relevant information for product classification. Such noisy wavelengths can undermine the performance of predictive and exploratory multivariate techniques. In this paper, we propose an iterative framework for selecting subsets of NIR wavelengths aimed at classifying samples into categories. To that end, we integrate Principal Component Analysis (PCA) with three classification techniques: k-Nearest Neighbor (KNN), Probabilistic Neural Network (PNN) and Linear Discriminant Analysis (LDA). PCA is first applied to the NIR data, and a wavelength importance index is derived from the PCA loadings. Samples are then classified using only the wavelength with the highest index, and the classification accuracy is calculated; next, the wavelength with the second highest index is added to the dataset and a new classification is performed. This forward iterative procedure continues until all original wavelengths have been added to the dataset used for classification. The subset of wavelengths yielding the maximum accuracy is chosen as the recommended subset. The proposed framework performed remarkably well when applied to four datasets related to food and pharmaceutical products. Copyright © 2016 John Wiley & Sons, Ltd.
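The forward procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formula for the wavelength importance index, so the sketch assumes an index equal to the explained-variance-weighted sum of absolute PCA loadings. It also assumes leave-one-out validation, shows only a plain Euclidean KNN classifier (PNN and LDA are omitted), and uses hypothetical function names (`importance_index`, `loo_knn_accuracy`, `forward_select`) and synthetic data.

```python
import numpy as np

def importance_index(X, n_components=2):
    """ASSUMED index: explained-variance-weighted sum of absolute PCA loadings."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the PCA loading vectors.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, Vt.shape[0])
    weights = s[:k] ** 2 / np.sum(s ** 2)  # explained-variance ratios
    return weights @ np.abs(Vt[:k])        # one score per wavelength

def loo_knn_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a Euclidean k-NN classifier (assumed validation)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a sample may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]
    correct = 0
    for i, idx in enumerate(nearest):
        labels, counts = np.unique(y[idx], return_counts=True)
        correct += labels[np.argmax(counts)] == y[i]
    return correct / len(y)

def forward_select(X, y, n_components=2, k=3):
    """Add wavelengths in decreasing index order; keep the most accurate subset."""
    order = np.argsort(importance_index(X, n_components))[::-1]
    best_acc, best_n = -1.0, 0
    for n in range(1, len(order) + 1):
        acc = loo_knn_accuracy(X[:, order[:n]], y, k)
        if acc > best_acc:                 # ties keep the smaller subset
            best_acc, best_n = acc, n
    return order[:best_n], best_acc

# Synthetic stand-in for NIR spectra: 40 samples, 10 "wavelengths",
# of which only column 3 separates the two classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(0.0, 0.1, size=(40, 10))
X[:, 3] += 5.0 * y

subset, acc = forward_select(X, y)
print("selected wavelength indices:", subset.tolist(), "accuracy:", acc)
```

Because ties in accuracy are resolved in favor of the earlier (smaller) subset, the sketch returns the most compact set of wavelengths that reaches the maximum accuracy; a different tie-breaking rule is equally compatible with the description in the abstract.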
