Leaf chlorophyll content (LCC) is a key indicator of the photosynthetic capacity of Populus deltoides Marshall. Unmanned aerial vehicle (UAV) hyperspectral imagery provides an effective approach for LCC estimation, but band redundancy significantly degrades model accuracy and computational efficiency. Feature selection algorithms used in isolation fail to balance computational efficiency against the search for an optimal band subset, and they are difficult to combine with different regression algorithms when the candidate subset changes dynamically. This study proposes an ensemble feature selection framework to improve the accuracy of LCC estimation from UAV hyperspectral data. First, the embedded algorithm was improved by introducing the SHapley Additive exPlanations (SHAP) algorithm into the band ranking system. Second, a dynamic ranking strategy was employed to remove bands in steps of 10, with an LCC model developed at each step, and the initial band subset was identified based on estimation accuracy. Finally, the wrapper algorithm was applied to the initial band subset to search for the optimal band subset and develop the corresponding model. Three regression algorithms, gradient boosting regression trees (GBRT), support vector regression (SVR), and Gaussian process regression (GPR), were combined with this framework for LCC estimation. The results indicated that the GBRT-Optimal model, developed using 28 bands, achieved the best performance, with an R² of 0.848, an RMSE of 1.454 μg/cm² and an MAE of 1.121 μg/cm². Compared with the model that used all bands as inputs, this optimal model reduced the RMSE by 24.37%. Beyond the estimation of biophysical and biochemical parameters, the method is also applicable to other hyperspectral imaging tasks.
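The two-stage search described above (ranked removal in steps of 10, followed by a wrapper search on the retained subset) can be sketched roughly as follows. This is a minimal skeleton under stated assumptions, not the authors' implementation: `rank_fn` is a hypothetical stand-in for the SHAP-based embedded ranking, `error_fn` is a hypothetical stand-in for fitting a regressor (GBRT, SVR or GPR) and returning its validation RMSE, and greedy backward elimination is an assumed wrapper strategy, since the abstract does not say which wrapper was used. The toy ranking and error functions at the bottom are purely illustrative and involve no real spectra.

```python
from typing import Callable, List, Sequence, Tuple

RankFn = Callable[[Sequence[int]], List[int]]   # returns bands, most important first
ErrorFn = Callable[[Sequence[int]], float]      # returns a validation error (e.g. RMSE)


def stepwise_removal(bands: Sequence[int], rank_fn: RankFn,
                     error_fn: ErrorFn, step: int = 10) -> Tuple[List[int], float]:
    """Stage 1: re-rank the current bands, drop the `step` lowest-ranked ones,
    and remember the subset with the lowest error seen so far."""
    current = list(bands)
    best_subset, best_err = list(current), error_fn(current)
    while len(current) > step:
        ranked = rank_fn(current)      # ranking is recomputed on each reduced set
        current = ranked[:-step]       # remove the `step` least important bands
        err = error_fn(current)
        if err < best_err:
            best_subset, best_err = list(current), err
    return best_subset, best_err


def wrapper_search(bands: Sequence[int], error_fn: ErrorFn) -> Tuple[List[int], float]:
    """Stage 2 (assumed strategy): greedy backward elimination, removing one
    band at a time as long as the error keeps decreasing."""
    current, best_err = list(bands), error_fn(bands)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for band in list(current):
            trial = [b for b in current if b != band]
            err = error_fn(trial)
            if err < best_err:
                current, best_err, improved = trial, err, True
                break
    return current, best_err


# Toy stand-ins (hypothetical): three "informative" bands out of 50.
INFORMATIVE = {3, 17, 42}


def toy_rank(subset: Sequence[int]) -> List[int]:
    # Informative bands rank first; the rest are ordered by index.
    return sorted(subset, key=lambda b: (b not in INFORMATIVE, b))


def toy_error(subset: Sequence[int]) -> float:
    # Error grows with subset size and with each missing informative band.
    missing = len(INFORMATIVE - set(subset))
    return 1.0 + 0.01 * len(subset) + 0.5 * missing


initial, initial_err = stepwise_removal(range(50), toy_rank, toy_error, step=10)
optimal, optimal_err = wrapper_search(initial, toy_error)
print(len(initial), sorted(optimal))
```

In a real pipeline, `rank_fn` would refit the model on the current bands and order them by an importance score such as mean absolute SHAP value, and `error_fn` would return a cross-validated RMSE. Keeping both as callables lets the same search loop be reused with any of the three regressors.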