Abstract

Relatively few epidemiological studies have used Random Forests (RF), possibly because the time series data often encountered in this discipline are perceived as unsuitable for supervised learning methods. We show that RF can be used for such data and demonstrate this with an example examining which social activities influence pertussis transmission. Results are compared with those from regression with ARIMA errors. Pertussis continues to be perceived as a childhood condition, despite recent increases in incidence at older ages. COVID-19 provided a unique situation: social restrictions were implemented and the number of pertussis cases declined, which meant the influence of different activities on transmission could be gauged. Data detailing restrictions were taken from the Oxford 'COVID-19 Government Response Tracker' (OxCGRT). The number of pertussis cases and the OxCGRT variables were lagged and embedded into a matrix, which was then used to fit an RF regression model. Based on variable importance (VIMP), this identified ‘international travel’, ‘public events’ and ‘workplace’ as the most important variables, suggesting that adult-based activities may matter most. An ARIMA(1,0,1) model, using OxCGRT categories as external regressors, similarly indicated that adult social activities better accounted for the number of pertussis cases.
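
The lag-and-embed step mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function name `lag_embed`, the column naming, the choice of two lags and the toy numbers are all assumptions made for illustration. The idea is that each row of the matrix holds the previous values of the case counts and of each restriction variable, while the response is the current case count, so that a supervised learner such as an RF regressor can be fitted to time series data.

```python
from typing import Dict, List, Sequence, Tuple


def lag_embed(
    target: Sequence[float],
    regressors: Dict[str, Sequence[float]],
    n_lags: int,
) -> Tuple[List[List[float]], List[float], List[str]]:
    """Turn aligned time series into a supervised-learning design matrix.

    Row t holds the previous n_lags values of the target and of every
    regressor; the response is the target at time t.
    (Hypothetical helper, not taken from the paper.)
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    n = len(target)
    for name, series in regressors.items():
        if len(series) != n:
            raise ValueError(f"series '{name}' is not aligned with the target")

    # Column labels: lagged case counts first, then each lagged regressor.
    columns = [f"cases_lag{k}" for k in range(1, n_lags + 1)]
    for name in regressors:
        columns += [f"{name}_lag{k}" for k in range(1, n_lags + 1)]

    X, y = [], []
    # The first n_lags time points have no complete history, so skip them.
    for t in range(n_lags, n):
        row = [target[t - k] for k in range(1, n_lags + 1)]
        for series in regressors.values():
            row += [series[t - k] for k in range(1, n_lags + 1)]
        X.append(row)
        y.append(target[t])
    return X, y, columns


# Toy, made-up numbers purely for illustration (not data from the study).
cases = [30, 28, 25, 12, 8, 5, 6, 9]
restrictions = {"workplace": [0, 0, 1, 2, 3, 3, 2, 1]}
X, y, columns = lag_embed(cases, restrictions, n_lags=2)
```

The resulting `X` and `y` could then be passed to any regression learner, for example a random forest implementation, whose variable importance scores would be read against the labels in `columns`.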

