Airborne laser scanning (ALS) is the most appealing remote sensing technique for the precise estimation of forest above-ground biomass (AGB). However, strong correlations (collinearity) among the independent variables derived from ALS data reduce the accuracy of AGB models. To address this issue, we propose a novel variable selection algorithm, an improved sure independence screening (SPV), which integrates the Pearson correlation coefficient, a threshold μ, and the variance inflation factor. We compared the performance of SPV-based, stepwise feature selection (SFS)-based, and least absolute shrinkage and selection operator (LASSO)-based AGB models developed with different regression approaches that are sensitive to strong collinearity. Field-measured data and corresponding ALS data, acquired from 1002 sample plots distributed across four distinct forest types in the Guangxi Zhuang Autonomous Region in southern China, were used to evaluate the variable selection techniques and develop the AGB models. The ALS variables selected by SPV exhibited weaker collinearity than those selected by SFS and LASSO. SPV-based AGB models outperformed SFS-based AGB models, with leave-one-out cross-validation adjusted R² (LOOCV R²adj) values higher by 0.1% to 27.8%, and outperformed LASSO-based AGB models, with LOOCV R²adj values higher by 0.4% to 16.3%. Hence, for variable selection in constructing AGB models (linear regression model, log–log regression model, and generalized additive model (GAM)) from strongly collinear ALS variables, SPV is preferred, followed by SFS and LASSO. The smooth curves from the GAMs developed using SPV-selected variables revealed that five canopy height variables (whp40, hmean, hp25, hp70, and hskew), one canopy density variable (wdp85), three density-related variables (nVegPts, bcc, and mcc), and one vertical structural variable (LADmean) were positively correlated with AGB.
The canopy height variables (whp40, hmean, hp25, and hp70) were identified as the most important variables for estimating AGB in the four forest types. The canopy density variable wdp85 had a strong effect on the AGB of coniferous forests but almost no effect on the AGB of broadleaved forests. Overall, this manuscript proposes a novel variable selection algorithm, SPV, that addresses collinearity among variables derived from ALS data, with significant implications for the application of ALS in forest inventory and forest modeling.
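The abstract names the three ingredients of SPV (Pearson correlation, a threshold μ, and the variance inflation factor) but does not spell out how they are combined. The following Python sketch shows one plausible way such a screening could work; it is an illustration under stated assumptions, not the authors' algorithm. The function names `vif` and `screen_variables`, the default values `mu=0.3` and `vif_limit=10.0`, and the greedy order of the two steps are all hypothetical.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_vars).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on all other columns plus an intercept.
    """
    n, p = X.shape
    out = np.ones(p)
    if p < 2:
        return out  # a single variable cannot be collinear with others
    for j in range(p):
        target = X[:, j]
        design = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        out[j] = np.inf if r2 >= 1.0 - 1e-12 else 1.0 / (1.0 - r2)
    return out


def screen_variables(X, y, mu=0.3, vif_limit=10.0):
    """Hypothetical two-step screening (assumed, not taken from the paper).

    Step 1: keep variables whose |Pearson r| with y is at least mu,
            ranked from strongest to weakest (sure-independence style).
    Step 2: add the ranked variables one by one, rejecting any variable
            that pushes a VIF in the selected set above vif_limit.
    Returns the selected column indices.
    """
    r = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    ranked = [int(j) for j in np.argsort(-r) if r[j] >= mu]

    selected = []
    for j in ranked:
        trial = selected + [j]
        if vif(X[:, trial]).max() <= vif_limit:
            selected = trial
    return selected
```

Called as `screen_variables(X, y)` with `X` holding the ALS metrics by plot and `y` the field-measured AGB, the function returns column indices that are both marginally related to AGB and only weakly collinear with each other. The cut-off values would need to be tuned to the data.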