Abstract

Since December 2019, the world has been fighting coronavirus disease (COVID-19), which is caused by a novel coronavirus termed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This work focuses on applications of machine learning algorithms in the context of COVID-19. First, regression analysis is performed to model the numbers of confirmed cases and deaths. Our experiments show that an autoregressive integrated moving average (ARIMA) model can reliably model the increase in the number of confirmed cases and can predict future cases. Second, several classifiers are used to predict whether a COVID-19 patient needs to be admitted to an intensive care unit (ICU) or semi-ICU. For this, classification algorithms are applied to a dataset of 5644 samples. The most significant attributes are selected by feature selection with an ExtraTrees classifier, and Proteina C reativa (C-reactive protein, mg/dL) is found to be the highest-ranked feature. Random forest, logistic regression, support vector machine, XGBoost, stacking and voting classifiers are then applied to the top 10 selected attributes of the dataset. Results show that the random forest and hard voting classifiers achieve the highest classification accuracy, near 98%, and the highest recall, 98%, in predicting the need for admission to ICU/semi-ICU units.
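To make the hard voting and recall terminology concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the three model outputs and the true labels are made-up illustrative values, the function names `hard_vote`, `accuracy` and `recall` are hypothetical, and label 1 is assumed to mean "needs ICU/semi-ICU admission".

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority label for each sample across all models.

    With an odd number of models and binary labels, ties cannot occur.
    """
    voted = []
    for sample_votes in zip(*predictions_per_model):
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


# Illustrative (made-up) labels: 1 = needs ICU/semi-ICU, 0 = does not.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 1, 0, 0, 0]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 0, 0, 1]

y_vote = hard_vote([model_a, model_b, model_c])
print(y_vote, accuracy(y_true, y_vote), recall(y_true, y_vote))
```

Recall matters in this setting because a false negative corresponds to a patient who needs intensive care but is not flagged. Soft voting would instead average predicted class probabilities, which this sketch does not cover.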

Highlights

  • In December 2019, the novel coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged in Wuhan, China

  • We evaluated the effectiveness of several classifiers, including random forest (RF), logistic regression (LR), support vector machine (SVM), XGBoost (XGB), stacking 1, stacking 2, AdaBoost, Bagging LR, hard voting and soft voting, in classifying whether or not a patient requires admission to an intensive care unit (ICU) or semi-ICU

  • The autoregressive integrated moving average (ARIMA) model fits the number of confirmed cases in two individual countries, India and the USA, both badly affected by the virus, and can predict the number of confirmed cases in the future


Summary

Introduction

In December 2019, the novel coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged in Wuhan, China. As of 29 November 2020, there had been 62,789,393 confirmed cases and 1,462,086 deaths in more than 218 countries and territories. This coronavirus is related to Middle East respiratory syndrome coronavirus (MERS-CoV) and severe acute respiratory syndrome coronavirus (SARS-CoV). To enter human cells, SARS-CoV-2 uses angiotensin-converting enzyme 2 (ACE2) as a cell receptor [11]. Like other viruses, this novel coronavirus has mutated over the six months or so following December 2019. The effects of COVID-19 can be managed if predictions can be made about the future spread of the disease and the likely demand for ICU and semi-ICU units. An autoregressive integrated moving average (ARIMA) model is proposed and implemented to forecast future COVID-19 cases.
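As a rough illustration of how an ARIMA forecast of cumulative case counts works, the sketch below implements the simplest non-trivial case, ARIMA(1,1,0) without a constant, in plain Python. This is an assumption made for illustration: the summary does not state which ARIMA order or library the authors used, the function names are hypothetical, and the series is made-up data rather than real case counts. The "I" step takes first differences (daily new cases), the "AR" step regresses each difference on the previous one by least squares, and the forecast is integrated back to cumulative counts.

```python
def fit_arima_110(series):
    """Estimate the AR(1) coefficient phi on the first differences.

    Least-squares fit of diff[t] = phi * diff[t-1], with no constant term.
    """
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    diffs = [b - a for a, b in zip(series, series[1:])]
    x, y = diffs[:-1], diffs[1:]
    denom = sum(v * v for v in x)
    if denom == 0:
        raise ValueError("lagged differences are all zero")
    return sum(a * b for a, b in zip(x, y)) / denom


def forecast_arima_110(series, phi, steps):
    """Forecast `steps` future levels by iterating the AR(1) on differences."""
    level = series[-1]
    diff = series[-1] - series[-2]
    forecasts = []
    for _ in range(steps):
        diff = phi * diff      # predicted next daily increase
        level = level + diff   # integrate back to a cumulative count
        forecasts.append(level)
    return forecasts


# Made-up cumulative counts whose daily increase halves each day.
cases = [0, 8, 12, 14, 15]
phi = fit_arima_110(cases)
print(phi, forecast_arima_110(cases, phi, 2))
```

In practice a statistics library would be used to select the orders (p, d, q), include moving-average terms, and provide confidence intervals; this sketch only shows the differencing and autoregression steps.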

Related Works
Regression Analysis
Data pre-processing
Correlation between features and the target variable
Classification algorithms
Performance metrics
Performance evaluation
Conclusion
Findings
Authors
