Nitrogen is one of the essential elements in citrus leaves (CL), and many studies have used hyperspectral technology to determine the nutrient content of CL. Conventional spectral denoising algorithms directly discard high-frequency signals and therefore lose part of the effective signal. To address this problem, this study proposes a denoising preprocessing algorithm for CL hyperspectral data: complete ensemble empirical mode decomposition with adaptive noise combined with sparse representation (CEEMDAN–SR). For this purpose, 225 sets of fresh CL were collected at the Institute of Fruit Tree Research of the Guangdong Academy of Agricultural Sciences, and their nitrogen content and corresponding hyperspectral data were measured. First, the spectral data were preprocessed using CEEMDAN–SR, Stein’s unbiased risk estimate and the linear expansion of thresholds (SURE–LET), sparse representation (SR), Savitzky–Golay (SG) smoothing, and the first derivative (FD). Second, feature extraction was carried out using principal component analysis (PCA), uninformative variable elimination (UVE), and the competitive adaptive reweighted sampling (CARS) algorithm. Finally, partial least squares regression (PLSR), support vector regression (SVR), random forest (RF), and Gaussian process regression (GPR) were used to construct CL nitrogen prediction models. The results showed that most prediction models built on data preprocessed with CEEMDAN–SR had better accuracy and robustness than those built with the other preprocessing methods. The prediction model based on CEEMDAN–SR preprocessing, PCA feature extraction, and GPR modeling achieved an R² of 0.944, an NRMSE of 0.057, and an RPD of 4.219. The study showed that the CEEMDAN–SR algorithm can effectively denoise CL hyperspectral data while reducing the loss of effective information. The prediction model using the CEEMDAN–SR+PCA+GPR algorithm can accurately estimate the nitrogen content of CL and provide a reference for the precise fertilization of citrus trees.
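The three reported metrics (R², NRMSE, and RPD) can be illustrated with a minimal Python sketch. The abstract does not state how NRMSE is normalized or which standard deviation RPD uses, so the sketch assumes NRMSE is RMSE divided by the mean of the measured values and RPD is the sample standard deviation of the measured values divided by RMSE. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, NRMSE, and RPD for measured vs. predicted values.

    Assumptions (not confirmed by the abstract):
      - NRMSE = RMSE / mean(y_true)
      - RPD   = sample standard deviation of y_true (n - 1) / RMSE
    """
    n = len(y_true)
    if n < 2 or n != len(y_pred):
        raise ValueError("need two equally sized sequences with n >= 2")

    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("measured values are constant; R^2 is undefined")

    rmse = math.sqrt(ss_res / n)
    sd_true = math.sqrt(ss_tot / (n - 1))

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "nrmse": rmse / mean_true,
        # A perfect prediction has zero RMSE, so RPD is unbounded
        "rpd": sd_true / rmse if rmse > 0 else math.inf,
    }


# Hypothetical leaf nitrogen values (illustrative numbers only)
measured = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.1, 3.4]
metrics = regression_metrics(measured, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```

Under these definitions, R² and RPD are closely linked: RPD is roughly 1/sqrt(1 − R²), so an R² of 0.944 corresponds to an RPD of about 4.2.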