Abstract

Introduction: From a public health perspective on the COVID-19 pandemic, accurate estimates of the infection severity of individuals are extremely valuable for informed decision-making and a targeted response to an emerging pandemic. This paper presents a machine learning based prognostic model that gives individuals early warning of COVID-19 infection using a healthcare dataset. A prognostic model using a random forest classifier and support vector regression is developed for predicting the infection susceptibility probability (ISP) score of COVID-19, and it is applied to an open healthcare dataset containing 27 fields. Typical fields include basic personal details such as age, gender, number of children in the household, and marital status, along with medical data such as coma score, pulmonary score, blood glucose level, and HDL cholesterol. Preprocessing is carried out to handle the numerical and categorical (non-numerical) values and the missing data in the dataset. The correlation between the variables is analyzed using the correlation coefficient, and a color-coded heat map is used to identify the factors that influence the ISP score. Based on accuracy, precision, sensitivity, and F-score, the random forest classifier provides better classification performance than support vector regression on the given dataset. An Android-based mobile application is developed using the proposed prognostic approach so that healthy individuals can predict their COVID-19 infection susceptibility score and take precautionary measures. Based on the results of the proposed method, clinicians and government officials can focus on highly susceptible people to limit the spread of the pandemic.

Methods: Random forest classifier and support vector regression techniques are applied to a medical healthcare dataset containing 27 variables to predict the susceptibility score of an individual towards COVID-19 infection, and the prediction accuracy of the two techniques is compared. Preprocessing is carried out to handle the missing data in the dataset. Correlation analysis using a heat map is carried out on the healthcare data to analyze the factors that influence the infection susceptibility probability (ISP) score of COVID-19. A confusion matrix is calculated to assess classification performance in terms of the number of true positives, true negatives, false positives, and false negatives. These values are then used to calculate the accuracy, precision, sensitivity, and F-score.

Results: The random forest classifier provides a classification accuracy of 99.7%, precision of 99.8%, sensitivity of 98.8%, and F-score of 99.29% on the given medical dataset.

Conclusion: The proposed machine learning approach can help individuals take additional precautions against COVID-19 infection, and clinicians and government officials can focus on highly susceptible people to limit the spread of the pandemic.
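
The four metrics named in the Methods follow directly from the confusion-matrix counts. The sketch below uses the standard definitions and assumes the F-score is the F1 score (the harmonic mean of precision and sensitivity), which the abstract does not state explicitly. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, sensitivity and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)       # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)     # share of actual positives that are found (recall)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_score": f_score,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=80, tn=100, fp=5, fn=15)

# Consistency check on the reported figures, assuming F-score means F1:
# harmonic mean of the reported precision (99.8%) and sensitivity (98.8%).
f_from_reported = 2 * 0.998 * 0.988 / (0.998 + 0.988)
```

Under the F1 assumption, the harmonic mean of the reported precision and sensitivity comes to roughly 99.3%, which is in line with the reported F-score of 99.29%.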

Highlights

  • From a public health perspective on the COVID-19 pandemic, accurate estimates of the infection severity of individuals are extremely valuable for informed decision-making and a targeted response to an emerging pandemic

  • Because collected healthcare data may contain missing values, effective preprocessing is essential, and it is carried out before the data are passed to the machine learning models for classification

  • Supervised machine learning techniques (random forest, support vector regression, linear regression, and neural networks) are used to predict the infection susceptibility probability (ISP) score of COVID-19 from the labeled healthcare data

  • A centralized data collection system and an Android mobile application can be used to share multimodal healthcare data and to provide the predicted infection susceptibility score to individuals, healthcare professionals, and government administrative officials

  • The proposed approach is implemented in Python, in a Jupyter Notebook environment, using the NumPy, pandas, and scikit-learn libraries, and involves several machine learning algorithms: random forest, logistic regression, support vector regression, linear regression, and neural networks
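
The preprocessing and random forest steps listed above could be wired together in scikit-learn roughly as follows. This is a minimal sketch, not the authors' implementation: the data are synthetic, the column names and the labeling rule are hypothetical, median and most-frequent imputation are assumed choices, and treating the predicted probability of the positive class as an ISP-like score is an assumption (the source does not say how the ISP score is derived).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for the healthcare dataset (hypothetical columns).
df = pd.DataFrame({
    "age": rng.integers(18, 90, size=n).astype(float),
    "blood_glucose": rng.normal(100.0, 20.0, size=n),
    "gender": rng.choice(["female", "male"], size=n),
    "marital_status": rng.choice(["single", "married"], size=n),
})
# Toy label rule, used only so the example has something to learn.
y = ((df["age"] > 60) | (df["blood_glucose"] > 120)).astype(int)

# Blank out some entries to mimic missing values in collected data.
df.loc[rng.choice(n, size=30, replace=False), "blood_glucose"] = np.nan
df.loc[rng.choice(n, size=30, replace=False), "gender"] = np.nan

numeric_cols = ["age", "blood_glucose"]
categorical_cols = ["gender", "marital_status"]

# Numeric fields: fill gaps with the median.
# Categorical fields: fill gaps with the most frequent value, then one-hot encode.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

model = Pipeline([
    ("prep", preprocess),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)
model.fit(X_train, y_train)

# Assumption: use the positive-class probability as an ISP-like score in [0, 1].
isp_scores = model.predict_proba(X_test)[:, 1]
test_accuracy = model.score(X_test, y_test)
```

Keeping imputation and encoding inside the pipeline means they are fitted on the training split only, so no information from the held-out rows leaks into preprocessing.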


Summary

Introduction

From a public health perspective on the COVID-19 pandemic, accurate estimates of the infection severity of individuals are extremely valuable for informed decision-making and a targeted response to an emerging pandemic. This paper presents a machine learning based prognostic model that gives individuals early warning of COVID-19 infection using a healthcare dataset. A prognostic model using a random forest classifier and support vector regression is developed for predicting the infection susceptibility probability (ISP) score of COVID-19, and it is applied to an open healthcare dataset containing 27 fields. An Android-based mobile application is developed using the proposed prognostic approach so that healthy individuals can predict their COVID-19 infection susceptibility score and take precautionary measures. In related work, an artificial intelligence-based rapid diagnosis approach for COVID-19 patients was developed using the analysis of chest X-ray images (Mei et al. 2020), and an interpretable mortality prediction model for COVID-19 patients was developed using a healthcare dataset (Yan et al. 2020).
