Estimating hourly and continuous ground-level PM2.5 concentrations using an ensemble learning algorithm: The ST-stacking model

Luwei Feng, Yiyan Li, Yumiao Wang, Qingyun Du

doi:10.1016/j.atmosenv.2019.117242

Abstract

Estimating hourly and continuous ground-level fine particulate matter (PM2.5) concentrations is essential for identifying PM2.5 pollution sources, developing targeted policies and studying population exposure. However, current PM2.5 estimation studies rely heavily on satellite-based aerosol optical depth (AOD) data. The limited transit times of polar-orbiting satellites such as Terra and Aqua, nighttime gaps in data from geostationary satellites such as Himawari-8, and the cloud contamination reported for both types of satellites make it difficult to estimate spatiotemporally continuous PM2.5 concentrations. In this study, a spatiotemporal PM2.5 characteristic was constructed using a spatiotemporal fusion method. Specifically, multi-source data, including spatiotemporal, periodic, meteorological, vegetation, anthropogenic and topological characteristics, were incorporated into an ensemble learning method that combined extreme gradient boosting (XGBoost), k-nearest neighbour (KNN) and back-propagation neural network (BPNN) algorithms in level 1 and used linear regression (LR) for integration in level 2. This optimized stacking strategy, which accounts for the spatiotemporal autocorrelation of PM2.5, is called the ST-stacking model. The model was trained, validated and tested with data acquired for China in 2017. The ST-stacking model outperformed the XGBoost, KNN and BPNN models by 9.27% on average, with an R2 of 0.9191. Using the model, the 24-h and continuous ground-level PM2.5 concentrations in mainland China on 11 May 2017 were mapped, and parts of Beijing and Chengdu were selected for more detailed analysis. The PM2.5 concentrations in the Taklimakan Desert, the North China Plain, the Sichuan Basin and the Yangtze Plain were much higher than those in other locations on this day, which was generally consistent with the long-term patterns reported in previous studies.
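The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn is available, uses purely synthetic data in place of the multi-source predictors, and substitutes scikit-learn's GradientBoostingRegressor for XGBoost and MLPRegressor for the BPNN. All variable names and hyperparameters are hypothetical, and the spatiotemporal PM2.5 characteristic is not modelled here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the predictors and the PM2.5 target (assumption:
# five generic features; the real study uses multi-source data for China).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (30 + 8 * X[:, 0] - 5 * X[:, 1] + 3 * X[:, 2] * X[:, 3]
     + rng.normal(scale=2.0, size=600))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Level 1: three heterogeneous base learners.
# GradientBoostingRegressor stands in for XGBoost, MLPRegressor for the BPNN.
level1 = [
    ("gbr", GradientBoostingRegressor(random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
    ("mlp", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    )),
]

# Level 2: linear regression fitted on the out-of-fold level-1 predictions.
stack = StackingRegressor(
    estimators=level1, final_estimator=LinearRegression(), cv=5
)
stack.fit(X_train, y_train)

stack_r2 = r2_score(y_test, stack.predict(X_test))
print("stacked R2 on held-out data:", round(stack_r2, 4))
print("level-2 weights per base learner:", stack.final_estimator_.coef_)
```

The level-2 coefficients show how much weight the linear combiner assigns to each base learner. Because the level-2 model is fitted on cross-validated predictions, it does not simply reward the base learner that overfits the training data most.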

Full Text