Abstract

Predicting PM2.5 concentrations at fine spatial and temporal resolution (i.e., neighborhood scale, hourly) is challenging. The recent growth of low-cost sensor networks is increasing the spatial coverage of air quality data, which can supplement data from regulatory agency monitors. We developed an hourly, 500 m × 500 m gridded PM2.5 model for Los Angeles County that integrates data from the PurpleAir low-cost air sensor network, together with a quality control scheme for the PurpleAir data. To predict PM2.5, we used a random forest model with spatially and temporally varying predictors and random oversampling of high concentrations. The model achieved high prediction accuracy (10-fold cross-validation (CV) R² = 0.93, root mean squared error (RMSE) = 3.23 μg/m³; spatial CV R² = 0.88, spatial RMSE = 4.33 μg/m³; temporal CV R² = 0.90, temporal RMSE = 3.85 μg/m³). The model predicted spatial and diurnal patterns in PM2.5 on typical weekdays and weekends, as well as on non-typical days such as holidays and wildfire days. It provides estimates of PM2.5 at a far finer spatial and temporal resolution than existing methods that rely on a small number of sensors. Because they take advantage of low-cost PM2.5 sensors, our hourly random forest predictions can be combined with time-activity diaries in future studies, enabling geographically and temporally fine exposure estimation for specific population groups in studies of acute air pollution health effects and of environmental justice issues.
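
As a rough illustration of three ideas named in the abstract (random oversampling of high concentrations, sensor-grouped spatial cross-validation, and the RMSE and R² metrics), the following is a minimal Python sketch using only the standard library. It is not the authors' implementation: the record layout, the function names, the 35 μg/m³ threshold, and the duplication factor are all hypothetical, and R² is computed here as the coefficient of determination, which may differ from the definition used in the paper.

```python
import math
import random


def oversample_high(records, threshold, factor, seed=0):
    """Return records plus random duplicates of high-concentration records.

    `threshold` and `factor` are illustrative assumptions, not values
    taken from the paper. With factor=3, high records end up roughly
    three times as frequent as in the input.
    """
    rng = random.Random(seed)
    high = [r for r in records if r["pm25"] >= threshold]
    n_extra = int(len(high) * (factor - 1))
    extra = [rng.choice(high) for _ in range(n_extra)] if high else []
    return records + extra


def spatial_folds(records, k):
    """Map each sensor to one of k folds, so that all records from a
    sensor fall in the same fold (no sensor is in both train and test)."""
    sensors = sorted({r["sensor"] for r in records})
    return {sensor: i % k for i, sensor in enumerate(sensors)}


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical hourly records: sensor id and PM2.5 in ug/m3.
    data = [
        {"sensor": "a", "pm25": 5.0},
        {"sensor": "b", "pm25": 40.0},
        {"sensor": "c", "pm25": 60.0},
        {"sensor": "a", "pm25": 8.0},
    ]
    train = oversample_high(data, threshold=35.0, factor=3)
    print(len(train), spatial_folds(data, k=2))
```

In a sketch like this, oversampling would be applied only to the training portion of each fold, so that duplicated high-concentration records do not leak into the held-out data used to compute RMSE and R².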
