Abstract

Environmental epidemiology studies require models of multiple exposures to adjust for co-exposure and explore interactions. We estimated spatiotemporal exposure to surface air temperature and pollution (PM2.5, PM10, NO2, O3) at high spatiotemporal resolution (daily, 250 m) for 2018–2020 in Catalonia. Innovations include the use of TROPOMI products, a data split for remote sensing gap-filling evaluation, estimation of prediction uncertainty, and use of explainable machine learning.

We compiled meteorological and air quality station measurements, climate and atmospheric composition reanalyses, remote sensing products, and other spatiotemporal data. We performed gap-filling of remotely-sensed products using Random Forest (RF) models and validated them using Out-Of-Bag (OOB) samples and a structured data split. The exposure modelling workflow consisted of: 1) PM2.5 station imputation with PM10 data; 2) quantile RF (QRF) model fitting; and 3) geostatistical residual spatial interpolation. Prediction uncertainty was estimated using QRF. SHAP values were used to examine variable importance and the fitted relationships. Model performance was assessed via nested CV at the station level.

Evaluation of the gap-filling models using the structured split showed error underestimation when using OOB. Temperature models had the best performance (R2 = 0.98) followed by the gaseous air pollutants (R2 = 0.81 for NO2 and 0.86 for O3), while the performance of the PM2.5 and PM10 models was lower (R2 = 0.57 and 0.63 respectively). Predicted exposure patterns captured urban heat island effects, dust advection events, and NO2 hotspots. SHAP values estimated a high importance of TROPOMI tropospheric NO2 columns in PM and NO2 models, and confirmed that the fitted associations conformed to prior knowledge.

Our work highlights the importance of correctly validating gap-filling models and the potential of TROPOMI measurements. Moderate performance in PM models can be partly explained by the poor station coverage. Our exposure estimates can be used in epidemiological studies potentially accounting for exposure uncertainty.
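
The abstract does not describe how the station-level cross-validation was implemented, so the following is only a minimal sketch of the general idea: every observation from a given station is assigned to the same fold, so a model is never tested on a station it was trained on. The function names (`station_folds`, `train_test_indices`), the greedy balancing rule, and the toy station labels are all hypothetical and are not taken from the paper. In a nested scheme, the same grouping would be applied again to the training stations of each outer fold for hyperparameter tuning; that inner loop is omitted here.

```python
from collections import defaultdict


def station_folds(station_ids, n_folds):
    """Return one fold index per observation, keeping each station in a single fold.

    Hypothetical helper: stations are assigned greedily, largest first,
    to whichever fold currently holds the fewest observations.
    """
    # Count how many observations each station contributes.
    counts = defaultdict(int)
    for s in station_ids:
        counts[s] += 1

    fold_sizes = [0] * n_folds
    fold_of_station = {}
    # Sort by descending count, then by station id so the result is deterministic.
    for s, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        k = min(range(n_folds), key=lambda i: fold_sizes[i])
        fold_of_station[s] = k
        fold_sizes[k] += n

    return [fold_of_station[s] for s in station_ids]


def train_test_indices(folds, k):
    """Split observation indices into training and test sets for fold k."""
    train = [i for i, f in enumerate(folds) if f != k]
    test = [i for i, f in enumerate(folds) if f == k]
    return train, test


# Toy example: ten daily observations from four made-up stations.
stations = ["A", "A", "A", "B", "B", "C", "C", "C", "C", "D"]
folds = station_folds(stations, n_folds=2)

for k in range(2):
    train, test = train_test_indices(folds, k)
    train_stations = {stations[i] for i in train}
    test_stations = {stations[i] for i in test}
    # No station appears on both sides of the split.
    print(k, sorted(test_stations), train_stations & test_stations)
```

Holding out whole stations measures how well predictions transfer to unmonitored locations, which is the setting an exposure map is used in. A random split of observations would instead let the model see other days from the same station, which tends to give more optimistic error estimates.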

