Abstract

Background: Ensemble learning-based spatiotemporal models are increasingly being used to estimate residential air pollution exposures in epidemiological studies. While these machine learning models typically have improved performance, they suffer from exposure measurement error that is inherent in all models. Our objective is to develop a framework to formally assess shared, multiplicative measurement error (SMME) in our previously published three-stage, ensemble learning-based nitrogen oxides (NOx) model to identify its spatial and temporal patterns and predictors.

Methods: By treating the ensembles as an external dosimetry system, we quantified shared and unshared, multiplicative and additive (SUMA) measurement error components in our exposure model. We used generalized additive models (GAMs) with a smooth term for location to identify geographic locations with significantly elevated SMME and explain their spatial and temporal determinants.

Results: We found evidence of significant shared and unshared multiplicative error (p < 0.0001) in our ensemble learning-based spatiotemporal NOx model predictions. Unshared multiplicative error was 26 times larger than SMME. We observed significant geographic (p < 0.0001) and temporal variation in SMME, with the largest share (43%) of predictions with elevated SMME occurring in the earliest time period (1992–2000). Densely populated urban prediction regions with complex air pollution sources generally exhibited the highest odds of elevated SMME.

Conclusions: We developed a novel statistical framework to formally evaluate the magnitude and drivers of SMME in ensemble learning-based exposure models. Our framework can be used to inform the building of improved exposure models in the future.

Highlights

  • Exposure to traffic-related air pollution (TRAP) has repeatedly been associated with mortality and adverse health outcomes, including respiratory illnesses and cardiovascular disease, in large epidemiological cohort studies of children and adults (Zhang et al, 2002; Andersen et al, 2008; Gehring et al, 2010; Esposito et al, 2014; Ryan et al, 2005; Nordling et al, 2008; Chen et al, 2015; Rancière et al, 2017; HEI Panel on the Health Effects of Traffic-Related Air Pollution, 2010)

  • Stage 3 of the model uses the averaged stage 2 nitrogen oxides (NOx) estimates and constrains the parameter estimates of the temporal basis functions to re-predict exposure based on physical constraints meant to mimic known or observed real-life behavior of NOx

  • By examining temporal trends in SMME in Long Beach (Table 7), we found that the greatest proportion of NOx predictions with high SMME were observed in the cooler months of winter (39.5%) and fall (35.8%) and the majority of low SMME predictions were observed in the spring (27.5%) and summer (28.9%)


Summary

Introduction

Exposure to traffic-related air pollution (TRAP) has repeatedly been associated with mortality and adverse health outcomes, including respiratory illnesses and cardiovascular disease, in large epidemiological cohort studies of children and adults (Zhang et al, 2002; Andersen et al, 2008; Gehring et al, 2010; Esposito et al, 2014; Ryan et al, 2005; Nordling et al, 2008; Chen et al, 2015; Rancière et al, 2017; HEI Panel on the Health Effects of Traffic-Related Air Pollution, 2010). Sophisticated spatiotemporal exposure models that incorporate machine learning techniques are increasingly being developed to more accurately predict residential TRAP exposures (and other complex spatially and temporally varying exposures) (Li et al, 2017; Russo and Soares, 2014; Di et al, 2016), given that ‘gold standard’ personal monitoring to capture ‘true exposure’ is often not feasible in large cohort studies. Spatial and temporal uncertainties inherent in these exposure models result in a complex correlation structure, which leads to error in exposure predictions, referred to as exposure measurement error. These errors can be categorized as independent (unshared) or dependent (shared). Ensemble learning-based spatiotemporal models are being used to estimate residential air pollution exposures in epidemiological studies. While these machine learning models typically have improved performance, they suffer from exposure measurement error that is inherent in all models. Our objective is to develop a framework to formally assess shared, multiplicative measurement error (SMME) in our previously published three-stage, ensemble learning-based nitrogen oxides (NOx) model to identify its spatial and temporal patterns and predictors.
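The distinction between shared and unshared error can be illustrated with a small simulation. The sketch below is a toy example, not the SUMA framework used in the paper: it assumes a purely multiplicative error model in which each prediction equals the true exposure times a shared factor (common to both locations within a realization) and an unshared factor (drawn independently per location), both lognormal. The function names and standard deviations are hypothetical. On the log scale the multiplicative errors become additive, so the correlation between the log errors at two locations shows how much of the error is shared.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_log_errors(n_realizations, sd_shared, sd_unshared, seed=0):
    """Simulate log(prediction / truth) at two locations.

    Assumed toy model: prediction = truth * shared * unshared, so the
    log error is log(shared) + log(unshared), each term Gaussian.
    """
    rng = random.Random(seed)
    errors_a, errors_b = [], []
    for _ in range(n_realizations):
        # One shared draw per realization, applied to both locations.
        shared = rng.gauss(0.0, sd_shared)
        # Independent unshared draws, one per location.
        errors_a.append(shared + rng.gauss(0.0, sd_unshared))
        errors_b.append(shared + rng.gauss(0.0, sd_unshared))
    return errors_a, errors_b


if __name__ == "__main__":
    # Hypothetical settings: mostly shared error vs. unshared error only.
    for sd_shared, sd_unshared in [(0.5, 0.1), (0.0, 0.5)]:
        a, b = simulate_log_errors(5000, sd_shared, sd_unshared)
        print(sd_shared, sd_unshared, round(pearson(a, b), 3))
```

When the shared component dominates, errors at the two locations move together across realizations; when only unshared error is present, they are essentially uncorrelated. Under this toy model the expected correlation is the shared variance divided by the total log-error variance.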
