Abstract
Accurate estimation of surface PM2.5 concentration is critical for assessing PM2.5 exposure and the associated health impacts. Because ground monitoring stations have limited spatial coverage, most studies use satellite products to estimate surface PM2.5 concentration, applying machine learning (ML) techniques to build a relationship between satellite-retrieved aerosol optical depth (AOD) and ground-measured PM2.5 concentration. However, uncertainties in ML-based models may lead to considerable biases in PM2.5 estimation and need to be carefully examined. Here we evaluate the accuracy of PM2.5 concentrations estimated by two popular ML models (Random Forest and the back-propagation (BP) neural network), which were trained and tested on hourly data for the whole of 2017 over China: satellite-retrieved AOD from HIMAWARI, ground-measured PM2.5 from the China National Environmental Monitoring Center, ERA5 meteorological conditions, and other auxiliary variables. We propose a new validation method that takes the spatial pattern of the data into account. The results suggest that traditional validation methods may overestimate model performance in estimating PM2.5 in areas with sparse in-situ measurements. Moreover, the spatial distribution of the training data strongly affects the evaluation of model performance and should be carefully considered. Future studies should include at least a site-specific validation rather than relying only on random-sampling validation.
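The contrast between random-sampling validation and site-specific validation can be illustrated with a minimal sketch. This is not the validation method proposed in the paper, whose details are not given in the abstract. The function names, the toy station identifiers, and the 20% hold-out fraction are all assumptions made for illustration. The sketch only shows why a random split lets the same station appear in both training and test sets, while a site-based split holds out whole stations.

```python
import random


def sample_based_split(site_ids, test_fraction=0.2, seed=0):
    """Random-sampling split: records are shuffled without regard to station."""
    rng = random.Random(seed)
    indices = list(range(len(site_ids)))
    rng.shuffle(indices)
    n_test = int(round(test_fraction * len(indices)))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def site_based_split(site_ids, test_fraction=0.2, seed=0):
    """Site-specific split: every record of a held-out station goes to the test set."""
    rng = random.Random(seed)
    sites = sorted(set(site_ids))
    rng.shuffle(sites)
    n_test = max(1, int(round(test_fraction * len(sites))))
    held_out = set(sites[:n_test])
    train = [i for i, s in enumerate(site_ids) if s not in held_out]
    test = [i for i, s in enumerate(site_ids) if s in held_out]
    return train, test


def shared_sites(site_ids, train, test):
    """Stations that contribute records to both the training and the test set."""
    return {site_ids[i] for i in train} & {site_ids[i] for i in test}


# Toy data (hypothetical): 10 stations with 24 hourly records each.
site_ids = [f"site_{k}" for k in range(10) for _ in range(24)]

rand_train, rand_test = sample_based_split(site_ids)
site_train, site_test = site_based_split(site_ids)

print("stations shared under random split:", len(shared_sites(site_ids, rand_train, rand_test)))
print("stations shared under site split:  ", len(shared_sites(site_ids, site_train, site_test)))
```

Under the random split, test records come from stations the model has already seen, so the score mostly reflects temporal interpolation at known locations. Under the site-based split, no test station contributes training data, which is closer to the situation in areas with sparse in-situ measurements.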