Abstract

Keywords: Pollen, Machine Learning, Allergic Disease

Background and aim: High concentrations of airborne pollen trigger seasonal allergies and may also trigger severe adverse respiratory and cardiovascular health events. Accurate prediction of pollen concentration is valuable for epidemiological studies of the effects on cardiovascular, respiratory, and cognitive health. We aimed to develop a spatiotemporal land-use regression model for pollen that predicts daily concentrations at a fine spatial resolution of 1 × 1 km across Switzerland between 2003 and 2020.

Methods: Daily pollen concentrations for hazel, alder, ash, birch, and grasses were available from 14 sites. We considered spatial predictors (elevation, tree type), temporal predictors (date, season, month, week and day of the year, national daily pollen concentration), and spatiotemporal predictors (wind speed, wind direction, temperature, precipitation, relative humidity, satellite-observed Normalized Difference Vegetation Index (NDVI), and land use (CLC, Landsat satellite)) to explain variation in total pollen concentration for these five pollen species. We applied feature engineering techniques to encode categorical variables (land use) and to fill in missing values (Landsat). We fitted a random forest model and assessed it with 5-fold cross-validation.

Results: The median grass pollen concentration was 24 pollen/m³ (P5-P95 range 0-187 pollen/m³) during the main grass pollen season (May-July for all years). In preliminary results, a model predicting grass pollen concentration achieved an overall R² of 0.74 and a root mean squared error (RMSE) of 24.12 pollen/m³ in cross-validation. Temperature, humidity, wind speed, NDVI, Landsat, average national daily pollen concentration, and date features were the most important predictors of grass pollen concentration.

Conclusions: Building on nationally observed pollen concentrations and random forest machine learning, these spatiotemporal pollen models will serve to estimate individual residential pollen exposure. The resulting estimates will enable us to study respiratory and cardiovascular mortality and hospital admissions using historical data from the Swiss National Cohort and the Swiss Federal Office of Statistics.
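
The cross-validated R² and RMSE reported above can be illustrated with a minimal, dependency-free sketch of 5-fold cross-validation. This is not the authors' code. The data are synthetic, the single "temperature" feature is only a placeholder, and `fit_linear` is a hypothetical stand-in learner (one-feature least squares) used where the study fits a random forest. All function names are assumptions made for illustration.

```python
import math
import random
from statistics import mean


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def fit_linear(X_train, y_train):
    """Stand-in learner (NOT a random forest): least squares on feature 0."""
    xs = [row[0] for row in X_train]
    x_bar, y_bar = mean(xs), mean(y_train)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y_train)) / sxx
    return lambda row: y_bar + slope * (row[0] - x_bar)


def cross_val_predict(X, y, fit, k=5, seed=0):
    """Out-of-fold predictions: each sample is predicted by a model
    trained only on the other k-1 folds."""
    preds = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(X[i])
    return preds


# Synthetic example data (assumption): one temperature-like feature and a
# noisy, non-negative pollen-like target. Not real measurements.
rng = random.Random(42)
X = [[rng.uniform(5, 30)] for _ in range(200)]
y = [max(0.0, 3.0 * row[0] + rng.gauss(0, 5)) for row in X]

oof = cross_val_predict(X, y, fit_linear, k=5)
print("cross-validated R2:", round(r2_score(y, oof), 2))
print("cross-validated RMSE:", round(rmse(y, oof), 2))
```

Computing both metrics on the pooled out-of-fold predictions gives a single overall R² and RMSE. Averaging per-fold metrics is a common alternative, and the abstract does not say which was used.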
