An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection.

Aurelle Tchagna Kouanou, Mendel Patrice Nzogang, Thomas Mih Attia, Adèle Ngo Mouelas, Cyrille Feudjio, Anges Fleurio Djeumo, Christian Tchito Tchapga, Daniel Tchiotsop

doi:10.1155/2021/4733167

Abstract

Methods: Our analysis and machine learning algorithm are based on two of the most cited clinical datasets in the literature: one from San Raffaele Hospital in Milan, Italy, and the other from Hospital Israelita Albert Einstein in São Paulo, Brazil. The datasets were processed to select the features that most influence the target, and almost all of these turned out to be blood parameters. Exploratory Data Analysis (EDA) methods were applied to the datasets, and a comparative study of supervised machine learning models was carried out, after which the support vector machine (SVM) was selected as the best-performing model.

Results: As the best performer, SVM is used as our proposed supervised machine learning algorithm. After optimizing the SVM, we obtained an accuracy of 99.29%, a sensitivity of 92.79%, and a specificity of 100% on the Hospital Israelita Albert Einstein dataset from Kaggle (https://www.kaggle.com/einsteindata4u/covid19). The same procedure was applied to the dataset from San Raffaele Hospital (https://zenodo.org/record/3886927#.YIluB5AzbMV). Once more, the SVM outperformed the other machine learning algorithms, with an accuracy of 92.86%, a sensitivity of 93.55%, and a specificity of 90.91%.

Conclusion: The results are superior to others reported in the literature for these same datasets, leading us to conclude that our proposed solution is reliable for COVID-19 diagnosis.
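The three figures quoted in the abstract (accuracy, sensitivity, specificity) all derive from the confusion matrix of a binary classifier. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the toy labels are hypothetical, and the label convention (1 = COVID-19 positive, 0 = negative) is an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels.

    Assumed convention (not from the paper): 1 = COVID-19 positive, 0 = negative.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    return {
        # share of all cases classified correctly
        "accuracy": (tp + tn) / len(pairs),
        # share of actual positives that were detected (recall)
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # share of actual negatives that were correctly ruled out
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical toy labels, for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A specificity of 100%, as reported for the Kaggle dataset, corresponds to zero false positives in this formulation, while a sensitivity below 100% means some positive cases were missed.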

Highlights

Introduction: The novel coronavirus known as SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2), responsible for the COVID-19 pandemic, belongs to the large family of coronaviruses that cause fever, cough, dyspnea, and muscle pain, while imaging frequently reveals bilateral pneumonia [1,2,3].
Due to the constant shortage of reagents for PCR tests, which are the COVID-19 tests par excellence, several medical centers have opted for immunological tests that look for antibodies produced against the virus.
We propose a solution based on Data Analysis and Machine Learning to detect COVID-19 infections.

Summary

Related Works

Several works based on AI, along with ML and DL, have been carried out over the last two years in the context of diagnosis and detection of COVID-19 infections. In 2021, AlJame et al. [31] used routine blood tests and proposed an ensemble learning model for COVID-19 diagnosis. For data preparation, they used a K-Nearest Neighbors algorithm to deal with null values in the dataset and an isolation forest method to remove outliers. Using random forest (RF) as their best ML algorithm, they achieved a good result (accuracy 0.88, F1-score 0.76, sensitivity 0.66, specificity 0.91, and AUROC 0.86). They found that COVID-19 patients can be divided into subtypes based on the serum levels of immune cells, gender, and reported symptoms. They trained an XGBoost model that can distinguish COVID-19 patients from influenza patients with a sensitivity of 92.5% and a specificity of 97.9%. We optimize the SVM algorithm to achieve performance superior to that of all algorithms reported in the literature on the same datasets.
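The K-Nearest Neighbors handling of null values mentioned above can be illustrated with a small sketch. This is a simplified pure-Python version of the general idea, not the procedure of AlJame et al. or of this paper. The function name `knn_impute`, the choice `k=2`, the distance over shared features, and the toy values are all assumptions made for illustration. The isolation forest step is not shown.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Simplified sketch: the distance is the root mean squared difference over
    the features that both rows have observed.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common feature, so the rows cannot be compared
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]  # copy so the input is not modified
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: other rows where feature j is observed and a distance exists
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:  # leave the gap in place if no donor is available
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical two-feature records; the last one has a missing second value
rows = [
    [1.0, 10.0],
    [1.2, 12.0],
    [5.0, 50.0],
    [1.1, None],
]
filled = knn_impute(rows, k=2)
print(filled[3])
```

Here the incomplete record is closest to the first two rows on its observed feature, so the missing value is filled with the mean of their second feature rather than being pulled toward the distant third row.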

Proposed Approach

Exploratory Data Analysis

Evaluation

Results

Optimization Results of the Best Model

Discussions


Journal: Journal of Healthcare Engineering
Publication Date: Nov 22, 2021
Citations: 14
License type: CC BY 4.0
