Mixing cotton seeds of different cultivars and qualities leads to uneven growth and makes field management difficult. Beyond yield loss, it also results in inconsistent cotton quality and poor textile product quality, causing large economic losses for farmers and the cotton processing industry. Traditional cultivar identification methods for cotton seeds are time-consuming, labor-intensive, and cumbersome, and cannot meet the needs of modern agriculture and the modern cotton processing industry. There is therefore an urgent need for a fast, accurate, and non-destructive method for identifying cotton seed cultivars. In this study, hyperspectral images (397.32 nm-1003.58 nm) of five cotton cultivars, namely Jinke 20, Jinke 21, Xinluzao 64, Xinluzao 74, and Zhongmiansuo 5, were captured using a Specim IQ camera, and the average spectrum of the seeds of each cultivar was used for spectral analysis, with the aim of establishing a cotton seed cultivar identification model. Because the collected spectra were visibly noisy below 400 nm and above 1000 nm, the 400-1000 nm range was selected as the representative spectrum of the seed samples. Several preprocessing techniques, including Savitzky-Golay (SG) smoothing, Standard Normal Variate (SNV), and First Derivative (FD), were then applied individually and in combination to reduce noise and improve spectral quality. Additionally, the successive projections algorithm (SPA) was employed for spectral feature selection. A Partial Least Squares-Discriminant Analysis (PLS-DA) model was established on the full-band spectra. Furthermore, spectral features and textural features were fused to build Random Forest (RF), Convolutional Neural Network (CNN), and Extreme Learning Machine (ELM) identification models. The results showed that: (1) The SNV-FD preprocessing method gave the best denoising performance. (2) SPA highlighted the near-infrared region (800-1000 nm), the red region (620-700 nm), and the blue-green region (420-570 nm) as informative for identifying cotton cultivars. (3) Fusing spectral and textural features did not consistently improve accuracy across all modeling strategies, suggesting the need for further research on appropriate modeling strategies. (4) The ELM model achieved the highest cotton cultivar identification accuracy, with 100% on the training set and 98.89% on the test set. In conclusion, this study developed a highly accurate cotton seed cultivar identification model (the ELM model). It provides a new method for the rapid and non-destructive identification of cotton seed cultivars, which will help ensure the cultivar consistency of seeds used in cotton planting and improve the overall quality and yield of cotton.
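To make the SNV-FD preprocessing and the ELM classifier more concrete, the following is a minimal NumPy sketch run on synthetic spectra. It is not the authors' implementation. The band count, the Gaussian-shaped synthetic spectra, the hidden-layer size, the tanh activation, and all function names (`snv`, `first_derivative`, `train_elm`, `predict_elm`) are assumptions made for illustration, and no textural features or train/test split are included.

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

def first_derivative(spectra, wavelengths):
    """Finite-difference first derivative of each spectrum along the wavelength axis."""
    return np.gradient(spectra, wavelengths, axis=1)

def train_elm(features, labels, n_hidden, n_classes, rng):
    """Basic ELM: random fixed hidden layer, output weights solved by least squares."""
    w = rng.normal(size=(features.shape[1], n_hidden))  # random input weights (never trained)
    b = rng.normal(size=n_hidden)                       # random hidden biases (never trained)
    hidden = np.tanh(features @ w + b)                  # hidden-layer activations
    targets = np.eye(n_classes)[labels]                 # one-hot class targets
    beta = np.linalg.pinv(hidden) @ targets             # pseudo-inverse solution for output weights
    return w, b, beta

def predict_elm(features, model):
    """Return the class index with the largest ELM output for each sample."""
    w, b, beta = model
    return np.argmax(np.tanh(features @ w + b) @ beta, axis=1)

# Synthetic stand-in data (assumption): 5 classes x 20 samples, 200 bands in 400-1000 nm.
rng = np.random.default_rng(0)
wavelengths = np.linspace(400.0, 1000.0, 200)
labels = np.repeat(np.arange(5), 20)
centers = np.linspace(500.0, 900.0, 5)  # hypothetical class-specific peak positions
clean = np.exp(-((wavelengths[None, :] - centers[labels][:, None]) / 80.0) ** 2)
raw = clean + rng.normal(scale=0.01, size=clean.shape)

# SNV followed by first derivative, then fit and apply the ELM.
processed = first_derivative(snv(raw), wavelengths)
model = train_elm(processed, labels, n_hidden=50, n_classes=5, rng=rng)
predicted = predict_elm(processed, model)
```

In this sketch SNV is applied before differentiation, matching the order implied by the name SNV-FD. With real data, `raw` would hold the mean spectrum of each seed, and the model would be evaluated on a held-out test set rather than on the training samples.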