Abstract

Nitrogen dioxide (NO2) is a primary constituent of traffic-related air pollution and has well-established harmful effects on the environment and human health. Knowledge of the spatiotemporal distribution of NO2 is critical for exposure and risk assessment. A common approach for assessing exposure to outdoor air pollution is land-use regression (LUR), a linear regression on spatially referenced covariates. Although LUR models have been useful in many exposure and risk assessment studies, their typical assumption of independent errors is usually violated because the covariates cannot fully capture the spatial dependence in the response. The result is biased estimates of covariate effects and reduced sensitivity and specificity in the model-selection process. Here, we develop an approach for simultaneous variable selection and estimation of LUR models with spatiotemporally correlated errors that is feasible for large sample sizes. We employ a general-Vecchia approximation, which is highly accurate and guarantees linear complexity with respect to the sample size. We demonstrate the new approach with spatiotemporal random field simulations and with a case study of daily ground-level NO2 in the United States. In the simulations, our approach yields consistently better prediction than the competing methods, as measured by the cross-validation mean squared error (MSE). Additionally, its model-selection false-positive and false-negative rates are lower across most simulation scenarios. For NO2, our approach improves prediction MSE by at least 10 percent over all of the competing methods and yields significantly sparser models. US-wide daily NO2 predictions and R code are freely available.
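
The idea of accounting for correlated errors during variable selection can be illustrated with a minimal sketch. It is not the authors' implementation: it is written in Python rather than their R code, it assumes the error covariance is known (an exponential model with a nugget), and it uses a dense Cholesky factor where the paper uses a general-Vecchia approximation to reach linear complexity. All function names, parameter values, and the simulated data are hypothetical. The response and covariates are whitened with the Cholesky factor, and a lasso is then fitted to the whitened data by coordinate descent.

```python
import math
import random


def exp_cov(coords, range_=0.3, sigma2=1.0, nugget=0.1):
    """Exponential covariance with a nugget (assumed known in this sketch)."""
    n = len(coords)
    return [[sigma2 * math.exp(-math.dist(coords[i], coords[j]) / range_)
             + (nugget if i == j else 0.0) for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular L with L L^T = a (dense, cubic cost)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def forward_solve(l, b):
    """Solve L x = b for lower-triangular L, i.e. whiten b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(l[i][k] * x[k] for k in range(i))) / l[i][i])
    return x


def soft_threshold(z, t):
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(cols, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1.

    `cols` holds the columns of X, one list per covariate.
    """
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)
    col_ss = [sum(v * v for v in c) / n for c in cols]
    for _ in range(n_iter):
        for j in range(p):
            c = cols[j]
            rho = sum(c[i] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * c[i]
                beta[j] = new
    return beta


# Simulated example: 60 sites, 6 covariates, only the first two are active.
random.seed(0)
n, p = 60, 6
coords = [(random.random(), random.random()) for _ in range(n)]
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]  # columns of X
beta_true = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]

L = cholesky(exp_cov(coords))
z = [random.gauss(0, 1) for _ in range(n)]
err = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]  # correlated errors
y = [sum(beta_true[j] * X[j][i] for j in range(p)) + err[i] for i in range(n)]

# Whiten response and covariates, then select variables on the whitened data.
y_w = forward_solve(L, y)
X_w = [forward_solve(L, col) for col in X]
beta_hat = lasso_cd(X_w, y_w, lam=0.1)
print([round(b, 3) for b in beta_hat])
```

In practice the covariance parameters are unknown and must be estimated together with the regression coefficients, and the dense factorization above becomes infeasible at the sample sizes the abstract targets, which is the gap the Vecchia approximation is meant to close.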

