Abstract

Health and environmental hazards related to high pollution concentrations have become a serious issue from both public policy and human health perspectives. Using a machine learning (ML) approach, this research aims to improve the estimation of grid‐wise PM2.5, a criteria pollutant, by reducing the systematic bias in the speciation provided by the Modern‐Era Retrospective analysis for Research and Applications version 2 (MERRA‐2). The ML model was trained using various meteorological parameters and aerosol species simulated by MERRA‐2, together with ground measurements from Environmental Protection Agency (EPA) air quality system stations. The ML approach significantly improved performance and reduced mean bias in the 0–10 μg m−3 range. We also trained a Random Forest ML model for each EPA region using 1 year of collocated data sets. The resulting ML models for each EPA region were validated, and the aggregate data set has a Spearman Rank correlation (SR) of 0.73 (RMSE = 4.8 μg m−3) and 0.69 (RMSE = 5.8 μg m−3) for training and testing, respectively. With temporal averaging, the SR improved to 0.81, 0.89, and 0.90 (RMSE = 3.9, 1.6, and 1.1 μg m−3) for daily, monthly, and yearly averages, respectively. The results from the initial implementation of the ML model for the global region are encouraging. Still, they require more research and development to overcome challenges associated with data gaps in many parts of the world.

