Abstract
Accurate forecasting of emerging infectious diseases can guide public health officials in making appropriate decisions related to the allocation of public health resources. Due to the exponential spread of the COVID-19 infection worldwide, several computational models for forecasting the transmission and mortality rates of COVID-19 have been proposed in the literature. To accelerate scientific and public health insights into the spread and impact of COVID-19, Google released the Google COVID-19 search trends symptoms open-access dataset. Our objective is to develop 7 and 14-day-ahead forecasting models of COVID-19 transmission and mortality in the US using the Google search trends for COVID-19 related symptoms. Specifically, we propose a stacked long short-term memory (SLSTM) architecture for predicting COVID-19 confirmed and death cases using historical time series data combined with auxiliary time series data from the Google COVID-19 search trends symptoms dataset. Considering the SLSTM networks trained using historical data only as the base models, our base models for 7 and 14-day-ahead forecasting of COVID cases had the mean absolute percentage error (MAPE) values of 6.6% and 8.8%, respectively. On the other side, our proposed models had improved MAPE values of 3.2% and 5.6%, respectively. For 7 and 14 -day-ahead forecasting of COVID-19 deaths, the MAPE values of the base models were 4.8% and 11.4%, while the improved MAPE values of our proposed models were 4.7% and 7.8%, respectively. We found that the Google search trends for “pneumonia,” “shortness of breath,” and “fever” are the most informative search trends for predicting COVID-19 transmission. We also found that the search trends for “hypoxia” and “fever” were the most informative trends for forecasting COVID-19 mortality.
Highlights
In March 1st, 2020, the COVID-19 outbreak was declared a national emergency in the US
For COVID-19 death cases, we found that NJ and NY consistently had the highest number of total deaths and that their curves seemed to be flat starting the third week of July
We report the performance of stacked Long short-term memory (LSTM) (SLSTM) models for predicting COVID-19 cases and mortality k-day-ahead for k equals 7 and 14 days
Summary
In March 1st, 2020, the COVID-19 outbreak was declared a national emergency in the US. After exactly one year of this declaration and according to the JHU dashboard, the numbers of COVID-19 confirmed and death cases have reached more than 500 K and 17 M, respectively. This rapid spread of the virus in the US had negative impacts on several sectors including economy [1], education [2–4], health [5,6]. Used statistical methods for time series forecasting such as autoregressive integrated moving average (ARIMA) [11] have been used in multiple studies for forecasting COVID-19 (e.g., [12–14]) These methods are typically based on historical data and do not account directly for disease transmission dynamics or any relevant biological process [15–17]. Because the time series forecasting task can be formulated as a supervised learning problem [21], several machine learning algorithms have been used for forecasting COVID-19 (e.g., [22–25])
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.