Abstract

Accurate forecasting of emerging infectious diseases can guide public health officials in making appropriate decisions about the allocation of public health resources. Due to the exponential spread of COVID-19 worldwide, several computational models for forecasting the transmission and mortality rates of COVID-19 have been proposed in the literature. To accelerate scientific and public health insights into the spread and impact of COVID-19, Google released the Google COVID-19 search trends symptoms open-access dataset. Our objective is to develop 7- and 14-day-ahead forecasting models of COVID-19 transmission and mortality in the US using the Google search trends for COVID-19-related symptoms. Specifically, we propose a stacked long short-term memory (SLSTM) architecture for predicting COVID-19 confirmed cases and deaths using historical time series data combined with auxiliary time series data from the Google COVID-19 search trends symptoms dataset. Taking the SLSTM networks trained on historical data only as the base models, the base models for 7- and 14-day-ahead forecasting of COVID-19 cases had mean absolute percentage error (MAPE) values of 6.6% and 8.8%, respectively. In contrast, our proposed models had improved MAPE values of 3.2% and 5.6%, respectively. For 7- and 14-day-ahead forecasting of COVID-19 deaths, the MAPE values of the base models were 4.8% and 11.4%, while the improved MAPE values of our proposed models were 4.7% and 7.8%, respectively. We found that the Google search trends for “pneumonia,” “shortness of breath,” and “fever” were the most informative search trends for predicting COVID-19 transmission. We also found that the search trends for “hypoxia” and “fever” were the most informative for forecasting COVID-19 mortality.
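
The MAPE values quoted above can be read with the standard definition of the metric in mind: the average of |actual - predicted| / |actual| over the forecast points, expressed as a percentage. The following is a minimal sketch of that standard formula, not the authors' evaluation code; the function name `mape` and the sample numbers are illustrative assumptions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Standard textbook definition; assumes no actual value is zero.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    # Average the relative absolute errors, then scale to percent.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values only (not taken from the paper).
example_actual = [100.0, 200.0]
example_predicted = [110.0, 180.0]
example_score = mape(example_actual, example_predicted)
```

Each of the two sample forecasts is off by 10% of the true value, so `example_score` comes out at about 10. Because the error is relative, MAPE allows comparison across series of very different magnitudes, such as case counts and death counts.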

Highlights

  • On March 1st, 2020, the COVID-19 outbreak was declared a national emergency in the US

  • For COVID-19 deaths, we found that NJ and NY consistently had the highest total numbers of deaths and that their curves appeared to flatten starting in the third week of July

  • We report the performance of stacked long short-term memory (SLSTM) models for predicting COVID-19 cases and mortality k days ahead, for k = 7 and k = 14

Summary

Introduction

On March 1st, 2020, the COVID-19 outbreak was declared a national emergency in the US. Exactly one year after this declaration, according to the JHU dashboard, the numbers of COVID-19 confirmed cases and deaths had reached more than 17 M and 500 K, respectively. This rapid spread of the virus in the US had negative impacts on several sectors, including the economy [1], education [2–4], and health [5,6]. Statistical methods for time series forecasting, such as the autoregressive integrated moving average (ARIMA) [11], have been used in multiple studies for forecasting COVID-19 (e.g., [12–14]). These methods are typically based on historical data and do not directly account for disease transmission dynamics or any relevant biological process [15–17]. Because the time series forecasting task can be formulated as a supervised learning problem [21], several machine learning algorithms have been used for forecasting COVID-19 (e.g., [22–25]).
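
To make the supervised learning formulation concrete, a common approach is a sliding window: each training input is a run of consecutive past observations, and the target is the value k days after the end of that run. The sketch below illustrates this general idea only. The function name `make_supervised`, the window length, and the toy series are assumptions for illustration, not details taken from the paper.

```python
def make_supervised(series, window, horizon):
    """Convert a 1-D series into (input window, target) pairs.

    Each input holds `window` consecutive values; its target is the
    value `horizon` steps after the last value of that window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs, targets = [], []
    # The last usable start index leaves room for the window and the horizon.
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Toy daily counts (illustrative, not real data).
daily_counts = list(range(1, 11))

# Hypothetical setup: 3-day input window, 2-day-ahead target.
X, y = make_supervised(daily_counts, window=3, horizon=2)
```

With this toy series, the first pair is the input `[1, 2, 3]` with target `5`. Setting `horizon` to 7 or 14 gives the 7- and 14-day-ahead framing discussed in the text, and any regression model can then be trained on the resulting pairs.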
