Estimating forest growing stock volume (GSV) is crucial for monitoring forest growth and managing forest resources, as it reflects forest productivity. National field measurements are laborious and costly; however, integrating remote sensing data such as optical imagery, Synthetic Aperture Radar (SAR), and airborne laser scanning (ALS) with National Forest Inventory (NFI) data and machine learning (ML) methods has transformed forest management. In this study, random forest (RF), support vector regression (SVR), and Extreme Gradient Boosting (XGBoost) were used to predict GSV from Estonian NFI data, Sentinel-2 imagery, and ALS point cloud data. Four variable combinations were tested: CO1 (vegetation indices and LiDAR metrics), CO2 (vegetation indices and individual band reflectance), CO3 (LiDAR metrics and individual band reflectance), and CO4 (vegetation indices, individual band reflectance, and LiDAR metrics together). Across Estonia’s geographical regions, RF consistently delivered the best performance. In the northwest (NW), RF performed best with the CO3 combination, with an R² of 0.63 and an RMSE of 125.39 m³/plot. In the southwest (SW), RF performed best with the CO4 combination, achieving an R² of 0.73 and an RMSE of 128.86 m³/plot. In the northeast (NE), RF outperformed the other ML models under the CO4 combination, achieving an R² of 0.64 and an RMSE of 133.77 m³/plot. Finally, in the southeast (SE), the best performance was again achieved with the CO4 combination, yielding an R² of 0.70 and an RMSE of 21,120.72 m³/plot. These results underscore RF’s suitability for predicting GSV across diverse environments, though refining variable selection and improving tree species data could further enhance accuracy.
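The four variable combinations and the two reported accuracy measures can be illustrated with a minimal sketch. This is not the study's code: the column names (`ndvi`, `b04`, `height_p95`, and so on) are hypothetical placeholders, since the abstract does not list the actual predictors, and R² is assumed here to be the usual coefficient of determination.

```python
from math import sqrt

# Hypothetical predictor columns per feature group (assumption: the
# abstract names the groups but not the individual variables).
FEATURE_GROUPS = {
    "vegetation_indices": ["ndvi", "evi"],
    "band_reflectance": ["b04", "b08"],
    "lidar": ["height_p95", "canopy_cover"],
}

# The four combinations as described in the abstract.
COMBINATIONS = {
    "CO1": ["vegetation_indices", "lidar"],
    "CO2": ["vegetation_indices", "band_reflectance"],
    "CO3": ["lidar", "band_reflectance"],
    "CO4": ["vegetation_indices", "band_reflectance", "lidar"],
}


def columns_for(combo):
    """Return the flat list of predictor columns for one combination."""
    return [col for group in COMBINATIONS[combo] for col in FEATURE_GROUPS[group]]


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In a comparison like the one described, each model would be fitted on `columns_for(combo)` for every combination and region, and the pair (`r_squared`, `rmse`) computed on held-out plots would decide which combination is reported as best.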