High-resolution column-averaged dry air mole fraction of CO2 (XCO2) data are crucial for understanding the spatiotemporal patterns of XCO2 and for mitigating carbon emissions. Because of the limited scanning range of sensors and strict inversion conditions, satellite-retrieved XCO2 data are often significantly incomplete. Machine learning models are widely used to fill these gaps in satellite XCO2 data. However, the limitations of individual machine learning models and the complexity of the spatial distribution of XCO2 mean that the accuracy of XCO2 predictions still needs improvement. In this study, a new spatiotemporal stacked ensemble learning model (STEL) was developed by combining random forest (RF), extremely randomized trees (ERT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost) using the stacking ensemble learning methodology. To account for the spatiotemporal heterogeneity of XCO2, a novel spatiotemporal weighting feature was constructed as one of the model's input parameters. Finally, the XCO2 observed by Orbiting Carbon Observatory 2 (OCO-2) was reconstructed using STEL, and a monthly mean XCO2 dataset covering China from 2015 to 2020 was generated at a spatial resolution of 0.1°. The results show that STEL exhibits superior performance and generalization capabilities compared to individual machine learning models. On the test set, R2, RMSE, and MAPE were 0.9624, 1.0023 ppm, and 0.1583 %, respectively; in ground validation, they were 0.8970, 1.4213 ppm, and 0.2475 %. In 10-fold cross-validation, STEL's RMSE was 9.52 % lower than that of the best-performing single model (RF). The spatiotemporal trend of CO2 in China from 2015 to 2020 was analyzed using STEL XCO2 data. The results indicate that this dataset accurately reflects the spatiotemporal heterogeneity of XCO2 distribution at a fine scale.
Overall, XCO2 exhibited a spatiotemporal pattern of “high in the east and low in the west” and “high in spring and low in summer.” Except in summer, high XCO2 values were mainly distributed in the North China Plain. XCO2 trends and hotspots showed considerable spatial variation. The Pearl River Delta and Yangtze River Delta urban agglomerations had the fastest XCO2 growth rates, and the distribution of XCO2 hotspots was consistent with the distribution of population and economic centers. In the sparsely populated northwest of China, XCO2 grew rapidly due to increased thermal power generation and coal mining; the XCO2 hotspots there were mainly located in Xinjiang, Ningxia, and Inner Mongolia. The methodology and data presented are useful for further research on carbon emissions, carbon sinks, and climate change.
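
For readers unfamiliar with the three accuracy metrics reported above, the following is a minimal Python sketch of their standard definitions. It assumes the usual textbook formulas for R2, RMSE, and MAPE, and it assumes that the reported 9.52 % RMSE reduction is a relative reduction against the baseline model; the abstract does not state either point explicitly. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the inputs (ppm for XCO2)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned in percent."""
    ratios = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * ratios / len(y_true)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline` (assumed meaning)."""
    return 100.0 * (baseline - improved) / baseline


# Made-up XCO2 values in ppm, for illustration only (not data from the study).
observed = [400.0, 405.0, 410.0, 415.0]
predicted = [401.0, 404.0, 411.0, 414.0]

print("R2  :", r2_score(observed, predicted))
print("RMSE:", rmse(observed, predicted), "ppm")
print("MAPE:", mape(observed, predicted), "%")
```

Because XCO2 values sit near 400 ppm, an error of about 1 ppm corresponds to a MAPE of roughly a quarter of a percent, which is why the reported MAPE values look small next to the RMSE values.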