Abstract

Public health and its related facilities are crucial for thriving cities and societies. Optimal utilization of health resources saves money and time and, above all, saves lives. This has become even more evident as the pandemic has overstretched existing medical resources. In patient appointment scheduling specifically, a casual attitude toward missing medical appointments (no-shows) may cause severe damage to a patient's health. In this paper, we analyze more than six million patient appointment records to predict a patient's behaviors and characteristics using ten machine learning algorithms. We first extracted meaningful features from the raw data through data cleaning. We then applied the Synthetic Minority Oversampling Technique (SMOTE), the Adaptive Synthetic Sampling Method (ADASYN), and random undersampling (RUS) to balance the data. After balancing, we applied ten machine learning algorithms: random forest, decision tree, logistic regression, XGBoost, gradient boosting, AdaBoost, naive Bayes, stochastic gradient descent, multilayer perceptron, and support vector machine. We evaluated the results with six metrics: recall, accuracy, precision, F1-score, area under the curve (AUC), and mean square error (MSE). Our study achieved 94% recall, 86% accuracy, 83% precision, 87% F1-score, 92% AUC, and a minimum MSE of 0.106. The effectiveness of the presented data cleaning and feature selection is confirmed by better results in all training algorithms. Notably, recall is greater than 75%, accuracy is greater than 73%, F1-score is greater than 75%, MSE is less than 0.26, and AUC is greater than 74%. The research shows that combining different features, rather than relying on individual features, helps make better predictions of a patient's appointment status.
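
To make the balancing and evaluation steps concrete, the following is a minimal sketch of random undersampling (RUS) followed by four of the reported metrics (recall, precision, accuracy, F1-score). It is not the authors' pipeline: the function names, the toy feature tuples, and the label encoding (1 = no-show, 0 = show) are assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        # Sample without replacement so no row is duplicated.
        for row in rng.sample(members, target):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [l for _, l in balanced]


def classification_metrics(y_true, y_pred, positive=1):
    """Recall, precision, accuracy, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / len(pairs)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "f1": f1}


if __name__ == "__main__":
    # Hypothetical toy data: (age, days_until_appointment); 1 = no-show.
    rows = [(30 + i, i % 7) for i in range(10)]
    labels = [0] * 8 + [1] * 2
    bal_rows, bal_labels = random_undersample(rows, labels)
    print(Counter(bal_labels))  # class counts after balancing
    print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

Undersampling discards majority-class records, whereas SMOTE and ADASYN synthesize new minority-class records instead; those two methods are not shown here.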

Highlights

  • The COVID-19 pandemic has shown that caring for critical patients places the greatest strain on the health system

  • Show/no-show appointment predictions under random undersampling (RUS) are evaluated by mean square error (MSE) and area under the ROC curve (AUC), two measures often used to assess model performance

  • Mean square error is lowest with the Synthetic Minority Oversampling Technique (SMOTE), at 0.1069 under random forest, and the area under the curve is also highest with SMOTE, at 92.09% under random forest
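
The two measures named in the highlights can be sketched as follows. This is an illustrative, standard-library-only computation and not the authors' code. The function names and toy scores are assumptions, and the source does not state whether MSE was computed on predicted labels or on predicted probabilities; the sketch uses probabilities.

```python
def mean_squared_error(y_true, y_score):
    """Average squared gap between 0/1 labels and predicted scores."""
    return sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = no-show) and predicted no-show probabilities.
    y_true = [1, 0, 1, 0]
    y_score = [0.9, 0.2, 0.4, 0.6]
    print(mean_squared_error(y_true, y_score))
    print(roc_auc(y_true, y_score))
```

The pairwise loop is quadratic in the number of records, so it suits small illustrations only; for a dataset of millions of appointments a rank-based implementation would be the practical choice.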

Introduction

The COVID-19 pandemic has shown that caring for critical patients places the greatest strain on the health system. Governments have opted for full, long-term lockdowns as preventive measures to keep the number of urgent care patients low. A patient may reach such a critical state for many reasons, one of which is not following up with the Primary Care Provider (PCP). Complete treatment of any disease or health issue requires proper care and multiple patient visits to the PCP. PCPs need to plan policies that provide appropriate alerts/notifications to patients who are in a difficult situation and are failing to follow up.
