Abstract

The past two decades have demonstrated the great potential of airborne Light Detection and Ranging (LiDAR) data to improve the efficiency of forest resource inventories (FRIs). To make efficient use of LiDAR data in FRIs, the data need to be related to observations taken in the field. Various modeling techniques are available that enable a data analyst to establish a link between the two data sources. While the choice of modeling technique may have negligible effects on point estimates, different modeling techniques may deliver different estimates of precision. This study investigated the impact of various model and variable selection procedures on estimates of precision, with a focus on LiDAR applications in FRIs. The procedures considered included stepwise variable selection based on the Akaike Information Criterion (AIC), the corrected Akaike Information Criterion (AICc), and the Bayesian (or Schwarz) Information Criterion. Variables were also selected based on the condition number of the matrix of covariates (i.e., LiDAR metrics) and on the variance inflation factor. Other modeling techniques considered in this study were ridge regression, the least absolute shrinkage and selection operator (Lasso), partial least squares regression, and the random forest algorithm. Stepwise variable selection procedures were considered in both the (design-based) model-assisted and the model-based (or model-dependent) inference frameworks. All other techniques were investigated only for the model-assisted approach. In a comprehensive simulation study, the effects of the different modeling techniques on the precision of population parameter estimates (mean aboveground biomass per hectare) were investigated. Five different datasets were used. Three artificial datasets were simulated; two further datasets were based on FRI data from Canada and Norway. Canonical vine copulas were employed to create synthetic populations from the FRI data.
From all populations, simple random samples of different sizes were repeatedly drawn, and the mean and the variance of the mean were estimated for each sample. While only a single variance estimator was investigated for the model-based approach, three alternative estimators were examined for the model-assisted approach. The results of the simulation studies suggest that blind application of stepwise variable selection procedures leads to overly optimistic estimates of precision in LiDAR-assisted FRIs. The effects were severe for small sample sizes (n = 40 and n = 50). For large samples (n = 400), overestimation of precision was negligible. Good performance in terms of empirical standard errors and coverage rates was obtained for ridge regression, Lasso, and the random forest algorithm. This study concludes that the latter three modeling techniques may prove useful in future LiDAR-assisted FRIs.
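
The repeated-sampling design summarized above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not give the form of its estimators, so the code assumes a generic regression (model-assisted) estimator of the population mean under simple random sampling, a single hypothetical LiDAR covariate, and one common residual-based variance estimator. The population model, the parameter values, and the function names (`fit_line`, `model_assisted_mean`, `simulate`) are all assumptions made for illustration.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def model_assisted_mean(x_pop, idx, y_s):
    """Regression estimator of the population mean and its estimated variance.

    x_pop: covariate known for every population unit (hypothetical LiDAR metric)
    idx:   indices of the sampled units; y_s: field observations for those units
    """
    n_pop, n = len(x_pop), len(idx)
    x_s = [x_pop[i] for i in idx]
    a, b = fit_line(x_s, y_s)
    # Mean of model predictions over the population plus mean sample residual.
    pred_mean = statistics.fmean([a + b * xi for xi in x_pop])
    resid = [yi - (a + b * xi) for xi, yi in zip(x_s, y_s)]
    estimate = pred_mean + statistics.fmean(resid)
    # Assumed variance estimator: residual variance with n - 2 degrees of
    # freedom (two fitted parameters) and a finite population correction.
    variance = (1 - n / n_pop) * sum(e * e for e in resid) / (n * (n - 2))
    return estimate, variance


def simulate(n=50, reps=500, pop_size=2000, seed=1):
    """Draw repeated simple random samples; return true mean, mean estimate, coverage."""
    rng = random.Random(seed)
    # Artificial population (assumed): linear relation plus Gaussian noise.
    x_pop = [rng.uniform(2.0, 30.0) for _ in range(pop_size)]
    y_pop = [5.0 + 6.0 * x + rng.gauss(0.0, 20.0) for x in x_pop]
    true_mean = statistics.fmean(y_pop)
    estimates, covered = [], 0
    for _ in range(reps):
        idx = rng.sample(range(pop_size), n)
        est, var = model_assisted_mean(x_pop, idx, [y_pop[i] for i in idx])
        estimates.append(est)
        # Nominal 95% interval based on the normal quantile 1.96.
        covered += abs(est - true_mean) <= 1.96 * var ** 0.5
    return true_mean, statistics.fmean(estimates), covered / reps


if __name__ == "__main__":
    true_mean, mean_est, coverage = simulate()
    print(f"true mean {true_mean:.2f}, mean estimate {mean_est:.2f}, coverage {coverage:.3f}")
```

Comparing the empirical coverage rate with the nominal 95% level is how such a simulation reveals overly optimistic precision estimates. Inserting a variable selection step before the model fit would be the natural extension for studying the effect reported in the abstract.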
