In quantitative remote sensing of forest biomass, the number of candidate explanatory variables keeps growing, so selecting them effectively has become an important issue. The linear regression model is one of the most commonly used remote sensing models, and selecting explanatory variables is a vital step in building it. Focusing on variable selection and model stability, this paper compares the performance of eight linear regression parameter estimation methods in developing a remote sensing model of subtropical forest biomass: Stepwise Regression (SR), the Bayesian Information Criterion (BIC), the Akaike Information Criterion (AIC), a criterion based on prediction error (Cp), the Least Absolute Shrinkage and Selection Operator (Lasso), Adaptive Lasso, Smoothly Clipped Absolute Deviation (SCAD), and the Non-negative Garrote (NNG). For comparison, ordinary least squares (OLS) and ridge regression (RR), two commonly used methods with no variable selection ability, are also evaluated and discussed. Five aspects of performance are evaluated: (i) the coefficient of determination, prediction error, model error, etc.; (ii) significance tests of the differences between coefficients of determination; (iii) parameter stability; (iv) variable selection stability; and (v) variable selection ability. All results are obtained through five repetitions of ten-fold cross-validation (CV). Some evaluation indexes are calculated both with and without accounting for degrees of freedom. The results show that BIC performs best in the comprehensive evaluation, while NNG, Cp and AIC perform poorly overall. The remaining methods vary greatly in performance from one index to another. SR selects variables effectively, although it scores poorly on the commonly used indexes.
The short-wave infrared band and the texture features derived from it are the variables selected most frequently across methods, indicating that they play an important role in forest biomass estimation. Some of the conclusions in this paper are likely to change with the study object. The ultimate goal of this paper is to introduce a range of model-building methods with variable selection capability, so that practitioners have more options when establishing similar models and can choose the most appropriate and effective method for a specific problem.
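The evaluation protocol described above (repeated ten-fold CV, prediction error, and how often each variable is selected) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the data are synthetic, the Lasso is a small NumPy coordinate-descent routine with a fixed penalty `lam`, and the function names `lasso_cd` and `selection_frequency` are hypothetical.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso for standardized X and centered y.

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # residual y - X @ beta, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding sets small coefficients exactly to zero.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def selection_frequency(X, y, lam=0.1, n_repeats=5, n_folds=10, seed=0):
    """Repeat k-fold CV; return per-variable selection rate and mean test MSE."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    fold_mse = []
    for _ in range(n_repeats):
        folds = np.array_split(rng.permutation(n), n_folds)
        for test_idx in folds:
            train = np.setdiff1d(np.arange(n), test_idx)
            # Standardize using training-fold statistics only.
            mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
            y_mu = y[train].mean()
            beta = lasso_cd((X[train] - mu) / sd, y[train] - y_mu, lam)
            counts += beta != 0
            pred = ((X[test_idx] - mu) / sd) @ beta + y_mu
            fold_mse.append(np.mean((y[test_idx] - pred) ** 2))
    return counts / (n_repeats * n_folds), float(np.mean(fold_mse))


# Synthetic stand-in data (an assumption): 8 candidate variables,
# of which only columns 0 and 3 actually drive the response.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=120)

freq, mse = selection_frequency(X, y)
print("selection frequency per variable:", freq)
print("mean test MSE over 5 x 10 folds:", mse)
```

A variable whose selection frequency stays near 1 across all 50 fits is selected stably, which is the sense in which the abstract speaks of variable selection stability. The same loop could wrap any of the other selection methods in place of `lasso_cd`.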