Gridded Post-Processing Air Quality Predictions based on the Community Multi-scale Air Quality (CMAQ) Model

Jared Lee, Rajesh Kumar, Ju Hye Kim, Irina Djalalova, Scott Meech, James Wilczak, Stefano Alessandrini

doi:10.5194/egusphere-egu24-6190

Abstract

We present the outcomes of a two-year project carried out under the NOAA Joint Technology Transfer Initiative (JTTI). The project aimed to improve the operational air quality forecasts over the United States generated by the National Air Quality Forecasting Capability (NAQFC) at NOAA/NCEP. We focused on applying machine learning (ML) post-processing techniques to refine forecasts from the Community Multi-scale Air Quality (CMAQ) model. In particular, we concentrated on extending the analog ensemble (AnEn) model, currently used at NAQFC, from its existing point-based application to 2D gridded predictions. This approach, known for its success in weather prediction systems for various meteorological parameters, has also been applied to predict ozone and fine particulate matter (PM2.5) concentrations at surface monitoring sites in the Environmental Protection Agency (EPA) AirNow network.

The AnEn methodology reduces both systematic and random errors in CMAQ model forecasts, as shown in previous studies (Djalalova et al., 2015; Delle Monache et al., 2020). Furthermore, the AnEn method has been shown to provide accurate and reliable probabilistic wind speed predictions (Alessandrini et al., 2019).

The AnEn technique relies on a training dataset composed of CMAQ model predictions and the corresponding observations of the quantity of interest (e.g., O3 or PM2.5). This dataset is used to generate ensemble predictions for future times from historical observations. The ensemble is constructed by selecting the past CMAQ forecasts (referred to as analogs) that best match the current deterministic CMAQ forecast; the observations that verified those past forecasts then form the ensemble members. The matching process considers variables including the pollutant concentration and correlated meteorological parameters such as wind, temperature, and relative humidity.

Our study involved an initial application of the AnEn technique to correct CMAQ gridded surface concentrations of PM2.5 and ozone. This was accomplished by combining historical gridded chemical reanalysis data from the Copernicus Atmosphere Monitoring Service (CAMS) Near-Real-Time model with measurements from AirNow monitoring stations. The CAMS analysis integrates satellite-derived data on various atmospheric components and is produced with ECMWF's Integrated Forecasting System (IFS). The resulting 2D gridded CAMS analysis fields are produced every 12 hours with a spatial resolution of approximately 40 km.

The AnEn method requires a continuous training dataset of hourly observed chemical concentrations. To meet this requirement, we used each 12-hourly CAMS analysis together with the subsequent 11 forecast hours, creating a consistent sequence of hourly gridded observations, or pseudo-observations. Using the Satellite-Enhanced Data Interpolation (SEDI) technique (Dinku et al., 2015), we merged CAMS surface PM2.5 and ozone fields with the corresponding observations from the AirNow network. This technique corrects biases in the CAMS analyses and short-term forecasts while retaining the accuracy of AirNow measurements at their respective station locations.

Our presentation covers multiple phases of validation and verification. First, we validate the SEDI-corrected CAMS concentrations against AirNow PM2.5 and ozone measurements from stations that were not part of the SEDI correction process. We then assess the performance of the entire forecasting system over the contiguous United States for lead times of 0-72 hours. This evaluation uses standard verification metrics applicable to both deterministic and probabilistic forecasts.
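The analog selection described above can be illustrated with a minimal sketch for a single grid point and a single lead time. It is not the operational NAQFC implementation. The distance metric (a weighted Euclidean distance on predictors normalized by their training-period standard deviation), the function name `analog_ensemble`, the choice of three predictors, and all numerical values are assumptions made for illustration only.

```python
import math
import statistics


def analog_ensemble(current, past_forecasts, past_obs, n_members=3, weights=None):
    """Return the observations paired with the past forecasts closest to `current`.

    current        -- predictor values of the forecast to be corrected
    past_forecasts -- one list of predictor values per training-period forecast
    past_obs       -- the observation that verified each past forecast
    """
    n_pred = len(current)
    weights = weights or [1.0] * n_pred  # equal predictor weights (assumption)

    # Normalize each predictor by its training-period standard deviation so that
    # predictors with different units contribute comparably to the distance.
    sigmas = [
        statistics.pstdev([f[i] for f in past_forecasts]) or 1.0
        for i in range(n_pred)
    ]

    def distance(past):
        return math.sqrt(
            sum(
                w * ((c - p) / s) ** 2
                for w, c, p, s in zip(weights, current, past, sigmas)
            )
        )

    # Rank past forecasts from most to least similar and keep the best matches.
    ranked = sorted(range(len(past_forecasts)), key=lambda k: distance(past_forecasts[k]))
    return [past_obs[k] for k in ranked[:n_members]]


# Hypothetical training data: [PM2.5 forecast (ug/m3), wind speed (m/s), temperature (K)]
past_forecasts = [
    [12.0, 3.0, 290.0],
    [35.0, 1.0, 300.0],
    [11.0, 3.5, 289.0],
    [20.0, 2.0, 295.0],
    [13.0, 2.8, 291.0],
]
past_obs = [9.0, 28.0, 8.5, 15.0, 10.0]  # hypothetical verifying PM2.5 observations

current = [12.5, 3.1, 290.5]  # hypothetical current CMAQ forecast at this grid point
members = analog_ensemble(current, past_forecasts, past_obs, n_members=3)
ensemble_mean = sum(members) / len(members)
print(members, round(ensemble_mean, 2))
```

In the gridded setting described here, the same search would be repeated at every grid cell, with the SEDI-corrected CAMS fields supplying the values in `past_obs`. The spread of `members` gives the probabilistic information, and their mean gives a bias-corrected deterministic estimate.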
