BACKGROUND AND AIM: Accurate PM2.5 exposure assessment, often performed using statistical prediction models, is a critical component of health studies and regulatory action. Ensemble modeling is increasingly popular because it improves accuracy by combining the strengths of different models. Identifying where prediction uncertainty is greatest can inform monitor deployment and model development. We fit an ensemble model integrating multiple existing PM2.5 prediction models, estimated location-specific uncertainty in its predictions, and identified factors contributing to that uncertainty.
METHODS: We predicted 2015 annual PM2.5 concentrations at 0.01°×0.01° resolution across the contiguous US by combining three well-validated prediction models with the Bayesian Non-parametric Ensemble (BNE). Training data came from the US Environmental Protection Agency's Air Quality System (AQS) database. We estimated model uncertainty, which captures disagreement between models, uncertainty in the ensemble weights, and random error, as the standard deviation of the location-specific posterior predictive distribution. We analyzed how uncertainty varied with predicted PM2.5, AQS monitor count within a 50-km radius, summer- and winter-mean temperature, and population density using a generalized additive mixed model with penalized splines and a random intercept for state.
RESULTS: Mean (standard deviation, SD) predicted PM2.5 was 6.37 (1.78) μg/m³, with a spatial RMSE of 0.71 μg/m³, and mean (SD) uncertainty was 0.47 (0.20) μg/m³. We observed greater uncertainty in the Midwest and Great Lakes areas. Predicted concentration had a complex relationship with uncertainty, with a generally positive association above ~8 μg/m³. Monitor density was negatively associated with uncertainty. Winter temperature below 0°C was positively associated with uncertainty; summer temperature was positively associated with uncertainty below 19°C and negatively associated above 19°C. Very high population density was associated with lower uncertainty.
CONCLUSIONS: PM2.5 prediction uncertainty varied across space; uncertainty was greater in areas with more pollution, fewer monitors, colder winters, more moderate summers, and lower population density. Subsequent monitoring and prediction model development should consider prioritizing these areas.
KEYWORDS: Exposure assessment, particulate matter, spatial statistics
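ILLUSTRATIVE SKETCH: The uncertainty summary described in METHODS (the standard deviation of each location's posterior predictive distribution) can be illustrated with a minimal Python example. This is not the BNE implementation; it assumes that posterior predictive draws are already available as an array of shape (draws, locations), and the function name and toy values are hypothetical.

```python
import numpy as np

def posterior_predictive_sd(samples):
    """Summarize location-specific uncertainty.

    samples: array-like of shape (n_draws, n_locations), assumed to hold
    posterior predictive draws of PM2.5 (one column per grid cell).
    Returns the sample SD (ddof=1) of the draws at each location.
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim != 2 or samples.shape[0] < 2:
        raise ValueError("expected shape (n_draws >= 2, n_locations)")
    return samples.std(axis=0, ddof=1)

# Toy draws (hypothetical values, not study data): 1000 draws at 3 locations,
# simulated with a different spread at each location.
rng = np.random.default_rng(0)
toy_draws = rng.normal(loc=[6.0, 8.0, 10.0], scale=[0.3, 0.5, 0.9], size=(1000, 3))

point_prediction = toy_draws.mean(axis=0)         # posterior predictive mean
uncertainty = posterior_predictive_sd(toy_draws)  # posterior predictive SD
print(point_prediction, uncertainty)
```

In this sketch the per-location SD plays the role of the uncertainty measure that is subsequently related to covariates such as monitor count and temperature.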