Improving the precision of remote sensing estimation and fusing and analyzing multi-source data are crucial for accurately estimating aboveground carbon storage in forests. Using the Google Earth Engine (GEE) platform together with national forest resource inventory data and Landsat 8 multispectral imagery, this study applies four machine learning algorithms available on GEE: Random Forest (RF), Classification and Regression Trees (CART), Gradient Boosting Trees (GBT), and Support Vector Machine (SVM). With these algorithms, the whole of Yunnan Province is classified into seven categories: broadleaf forest, coniferous forest, mixed broadleaf-coniferous forest, water bodies, built-up areas, cultivated land, and other types. A comparison shows that RF outperforms the other algorithms in accuracy and reliability, making it the most suitable choice for estimating aboveground forest carbon storage from remote sensing data; the study therefore uses RF for both forest classification and carbon storage estimation. To build the models for estimating aboveground carbon stocks, remote sensing factors were extracted, the most relevant factors were selected using the Pearson correlation coefficient, and multiple linear regression, RF regression, and decision tree regression were fitted. The results indicate that among the four classification algorithms, the RF classifier performs best, with an overall accuracy of 84.96% and a Kappa coefficient of 76.46%. In the RF regression models, the R² values for the coniferous, broadleaf, and mixed broadleaf-coniferous forest models are 0.636, 0.663, and 0.638, respectively. For both RF and CART, the R² values of the three forest-type models exceed 0.6, indicating satisfactory model fit.
This study explores whether fine land use classification can improve the estimation of forest carbon stocks over large areas. In addition, the data sources used are entirely free, and medium- to low-resolution data can still provide useful reference value for practical applications, reducing the cost of use.
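Two of the computations named in the abstract, Pearson-based factor screening and the overall accuracy and Kappa coefficient of a classifier, can be illustrated with a minimal sketch. This is not the study's GEE workflow: the factor names (`ndvi`, `noise`), the 0.3 correlation threshold, and the sample confusion matrix are all hypothetical and chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / (sd_x * sd_y)


def select_factors(factors, target, threshold=0.3):
    """Keep factors whose |r| with the target meets the threshold.

    `factors` maps a factor name to its per-plot values; `target` holds the
    per-plot carbon stock. The threshold is an assumed value, not one taken
    from the study.
    """
    scores = {name: pearson_r(values, target) for name, values in factors.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    kept.sort(key=lambda name: abs(scores[name]), reverse=True)
    return kept, scores


def overall_accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix.

    cm[i][j] is the number of samples of reference class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals for each class.
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical plot-level data: one informative factor, one uninformative.
    factors = {"ndvi": [0.2, 0.4, 0.6, 0.8], "noise": [1, -1, -1, 1]}
    carbon = [10, 20, 30, 40]
    print(select_factors(factors, carbon)[0])

    # Hypothetical two-class confusion matrix.
    print(overall_accuracy_and_kappa([[40, 10], [5, 45]]))
```

The factors that survive screening would then be passed to the regression models (multiple linear regression, RF regression, or decision tree regression). Kappa is reported here as a fraction, so the abstract's 76.46% corresponds to 0.7646 on this scale.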