Abstract

Background: The recent global outbreak of coronavirus disease (COVID-19) is affecting many countries worldwide. Iran is one of the top 10 most affected countries. Search engines provide useful data from populations, and these data might be useful to analyze epidemics. Utilizing data mining methods on electronic resources’ data might provide a better insight into the COVID-19 outbreak to manage the health crisis in each country and worldwide.

Objective: This study aimed to predict the incidence of COVID-19 in Iran.

Methods: Data were obtained from the Google Trends website. Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases. All models were evaluated using 10-fold cross-validation, and root mean square error (RMSE) was used as the performance metric.

Results: The linear regression model predicted the incidence with an RMSE of 7.562 (SD 6.492). The most effective factors besides previous day incidence included the search frequency of handwashing, hand sanitizer, and antiseptic topics. The RMSE of the LSTM model was 27.187 (SD 20.705).

Conclusions: Data mining algorithms can be employed to predict trends of outbreaks. This prediction might support policymakers and health care managers to plan and allocate health care resources accordingly.

Highlights

  • A respiratory disease caused by a coronavirus emerged in Wuhan, China

  • Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases

  • The linear regression model predicted the incidence with a root mean square error (RMSE) of 7.562 (SD 6.492)

Summary

Methods

The daily new cases of coronavirus (daily incidence) in Iran from February 15, 2020, to March 18, 2020, were obtained from the Worldometer website [10]. Google Trends [11] was searched for concepts related to COVID-19 from February 10, 2020, to March 18, 2020. Google Trends does not provide absolute search numbers; instead, it provides a measure called interest over time, whose description states that “A value of 100 is the peak popularity for the term.”

Variables derived from Google Trends include the following:

  • The interest of the “Antiseptic selling” search term in Persian for the previous day in Iran

  • [Antiseptic buying]_pd: The interest of the “Antiseptic buying” search term in Persian for the previous day in Iran

  • [Hand washing]_pd: The interest of the “Handwashing” search term in Persian for the previous day in Iran

Long short-term memory (LSTM) is an artificial recurrent neural network architecture that is effective for predicting time series, where data are sequential [9]. Ten-fold cross-validation was used to evaluate the performance of the models, with root mean square error (RMSE) as the performance metric.
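The evaluation procedure described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a single predictor (the previous day's incidence) rather than the full set of Google Trends variables, assigns samples to folds in an interleaved fashion (the source does not say how folds were formed), and uses synthetic case counts. All function and variable names are hypothetical. RMSE is computed in the standard way, as the square root of the mean squared difference between observed and predicted values.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean square error: sqrt of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_rmse(xs, ys, k=10):
    """Return one RMSE value per fold of a k-fold cross-validation."""
    n = len(xs)
    scores = []
    for fold in range(k):
        # Assumption: interleaved folds, every k-th sample is held out.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        intercept, slope = fit_line(train_x, train_y)
        held_out = sorted(test_idx)
        preds = [intercept + slope * xs[i] for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], preds))
    return scores


# Synthetic daily case counts for illustration only (not the study's data).
daily_cases = [round(10 * 1.15 ** day) for day in range(32)]

# Predictor: the previous day's incidence; target: the current day's incidence.
x_prev = daily_cases[:-1]
y_today = daily_cases[1:]

fold_scores = k_fold_rmse(x_prev, y_today, k=10)
mean_rmse = statistics.fmean(fold_scores)
sd_rmse = statistics.stdev(fold_scores)
print(f"RMSE mean {mean_rmse:.3f}, SD {sd_rmse:.3f}")
```

Reporting the mean and standard deviation of the per-fold RMSE values mirrors the "RMSE (SD)" form used in the abstract. Extending the sketch to several predictors would require a multiple regression fit in place of `fit_line`.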
