The accuracy of digital elevation models (DEMs) in forested areas plays a crucial role in canopy height monitoring and ecological sensitivity analysis. Despite extensive research on DEMs in recent years, significant errors persist in forested areas because of canopy occlusion, terrain complexity, and limited signal penetration, which poses challenges for subsequent DEM-based analyses. This paper therefore proposes a CNN-LightGBM hybrid model and selects four forest types (tropical rainforest, coniferous forest, mixed coniferous and broad-leaved forest, and broad-leaved forest) as study sites to validate the model's performance in correcting COP30DEM across different forested areas. The hybrid model combines a CNN with the DenseNet architecture and LightGBM as the primary model. LightGBM was chosen for its leaf-wise growth strategy and histogram-based algorithm, which reduce the memory footprint of the data and allow more of the data to be used without sacrificing speed. The study uses elevation values from ICESat-2 as ground truth and several input parameters: COP30DEM, canopy height, forest coverage, slope, terrain roughness, and relief amplitude. To assess the CNN-LightGBM hybrid model against alternatives for DEM correction, a LightGBM model, a CNN-SVR model, and an SVR model are tested within the same sample space. Common meta-heuristic optimisation algorithms can alleviate overfitting and underfitting during model training to a certain extent, but they still have shortcomings. To overcome these, this paper adopts an improved Sparrow Search Algorithm (SSA) that borrows a strategy from the Firefly Algorithm (FA) to increase the diversity of solutions and the global search capability: the Firefly Algorithm-based Sparrow Search Optimization Algorithm (FA-SSA).
Comparing multiple models and validating the results against an airborne LiDAR reference dataset shows that the R² (coefficient of determination) of the CNN-LightGBM model is more than 0.05 higher than that of the other models. The FA-SSA-CNN-LightGBM model achieves the highest accuracy, with an RMSE of 1.09 m, a reduction of more than 30% relative to LightGBM and the other hybrid models. Compared with other elevation products for forested areas (such as FABDEM and GEDI), its accuracy is improved by more than 50%, and it performs significantly better than other commonly used DEMs in forested areas. This indicates that the method is feasible for correcting elevation errors in forested-area DEMs and is of significant value for advancing global topographic mapping.
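The two accuracy measures reported above, RMSE and R², together with the relative RMSE reduction, can be computed as in the following minimal sketch. It assumes the standard definitions of both metrics; the function names and the sample values in the usage note are hypothetical illustrations and are not taken from the study.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between reference and predicted elevations (same units)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def rmse_reduction_percent(baseline_rmse, corrected_rmse):
    """Relative RMSE reduction of a corrected model against a baseline, in percent."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - corrected_rmse) / baseline_rmse
```

As a usage illustration with made-up numbers, `rmse_reduction_percent(2.0, 1.0)` evaluates to `50.0`, meaning the corrected model halves the baseline error. In practice, `reference` would hold the reference elevations (for example, airborne LiDAR values) and `predicted` the corrected DEM elevations at the same points.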