Abstract

This study aims to develop an assumption-free, data-driven model to accurately forecast COVID-19 spread. To this end, we first employed Bayesian optimization to tune the Gaussian process regression (GPR) hyperparameters and develop an efficient GPR-based model for forecasting the recovered and confirmed COVID-19 cases in two highly impacted countries, India and Brazil. However, machine learning models do not by themselves consider the time dependency in the COVID-19 data series. To alleviate this limitation, dynamic information is taken into account by introducing lagged measurements as inputs when constructing the investigated machine learning models. Additionally, we assessed the contribution of the incorporated features to the COVID-19 prediction using the Random Forest algorithm. Results reveal that a significant improvement can be obtained using the proposed dynamic machine learning models. In addition, the results highlight the superior performance of the dynamic GPR compared to the other models (i.e., support vector regression, boosted trees, bagged trees, decision tree, Random Forest, and XGBoost), achieving an averaged mean absolute percentage error of around 0.1%. Finally, we provided the confidence level of the predicted results based on the dynamic GPR model and showed that the predictions are within the 95% confidence interval. This study presents a promising shallow and simple approach for predicting COVID-19 spread.
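The idea of adding lagged measurements and scoring forecasts by mean absolute percentage error can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: the function names, the lag count, the naive baseline, and the case numbers are all hypothetical, and the 95% interval helper assumes a Gaussian predictive distribution (as a GPR model provides).

```python
from typing import List, Sequence, Tuple


def make_lagged_dataset(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (X, y): each row of X holds the previous
    `n_lags` observations and the matching y is the value that follows them."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))  # y(t - n_lags), ..., y(t - 1)
        y.append(series[t])
    return X, y


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (assumes no actual value is 0)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def gaussian_interval_95(mean: float, std: float) -> Tuple[float, float]:
    """Two-sided 95% interval for a Gaussian prediction: mean +/- 1.96 * std."""
    half_width = 1.96 * std
    return mean - half_width, mean + half_width


# Hypothetical cumulative case counts, for illustration only (not study data).
cases = [100.0, 110.0, 125.0, 140.0, 160.0, 185.0]
X, y = make_lagged_dataset(cases, n_lags=2)

# Naive baseline standing in for a trained model: predict the most recent lag.
naive_forecast = [row[-1] for row in X]
baseline_error = mape(y, naive_forecast)
```

In a full pipeline, the rows of `X` would be passed to a regressor such as a GPR in place of the naive baseline, and the predictive mean and standard deviation it returns would feed `gaussian_interval_95`.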

Highlights

  • In December 2019, as the world was waiting to welcome 2020, a Wuhan hospital noted an unusual severe acute respiratory illness caused by a new virus, which spread swiftly

  • This work aimed to develop an effective data-driven approach to predict the number of COVID-19 confirmed and recovered cases in India and Brazil, ranked as the second and third countries with the highest number of confirmed cases behind the United States

  • This paper introduces a dynamic Gaussian process regression (GPR) model, with hyperparameters optimized via Bayesian optimization, for COVID-19 spread forecasting


Summary

Introduction

In December 2019, as the world was waiting to welcome 2020, a Wuhan hospital noted an unusual severe acute respiratory illness caused by a new virus, which spread swiftly. Results showed the superior detection accuracy of this approach compared to generative adversarial network (GAN), deep belief network (DBN), and restricted Boltzmann machine (RBM)-based 1SVM methods. This detector was verified using routine blood test samples from two hospitals in Brazil and Italy; a large dataset is needed to verify the generalization of this approach. In ref. 4, the authors present a comparative study of eight machine learning models for forecasting COVID-19, such as logistic regression, restricted Boltzmann machine, convolutional neural networks, and support vector regression (SVR). They used time-series data for confirmed and recovered COVID-19 cases from seven countries, including Brazil, India, and Saudi Arabia, recorded from January 22, 2020, to September 06, 2020. The study in ref. 30 employed an autoregression model based on the Poisson distribution, called Poisson autoregression (PAR), to predict the confirmed and recovered cases of COVID-19 in Jakarta. Results showed that this approach provides acceptable forecasting accuracy, with a MAPE value below 20%. Results reveal that SVR provides the best forecasting when using confirmed COVID-19 cases data from Saudi Arabia, and LR outperforms the other models when using Bahrain confirmed cases data.

