Introduction: Frequent no-shows for healthcare appointments waste resources and lead to suboptimal patient outcomes. Traditional mitigation methods, such as reminder messages and phone calls, often fall short, particularly in regions with less robust healthcare infrastructure. This study uses machine learning to develop a predictive model that identifies patients at high risk of missing appointments, based on data from the Mawid scheduling system in Southern Saudi Arabia.

Methods: A retrospective cross-sectional design was employed, focusing on visits to primary healthcare centers (PHCCs) and general hospitals. Data spanning 10 months, from June 2023 to March 2024, were collected from Mawid, comprising over one million observations and 18 features, including appointment details, patient demographics, and weather conditions. Machine learning models, including Decision Trees, Random Forests, Naive Bayes, Logistic Regression, and Artificial Neural Networks (ANNs), were developed and evaluated on accuracy, precision, recall, F1 score, and area under the curve (AUC).

Results: Overall, 55.07% of appointments were not attended. The Random Forest model performed best, with an accuracy of 0.765 and an AUC of 0.852, particularly when weather-related features were included. The ANN models also performed well, with an AUC of approximately 0.836. The study identified significant regional, seasonal, and environmental factors affecting no-show rates, with higher rates occurring during certain months and under specific weather conditions. Attendance patterns also differed by appointment type (regular versus walk-in) and by facility type (PHCCs versus hospitals).

Conclusion: Machine learning models, particularly Random Forests and ANNs, can effectively predict healthcare appointment no-shows, allowing for better resource allocation and patient care. Recognizing the influence of regional and environmental factors is crucial for developing targeted interventions to reduce no-show rates. Future research should explore integrating more contextual data to further refine these predictive models and to enhance healthcare delivery and operational efficiency.
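
For readers less familiar with the evaluation metrics named in the Methods, the sketch below shows how accuracy, precision, recall, F1 score, and AUC can be computed for a binary no-show classifier. It is illustrative only and is not the study's code: the function names, the 0.5 decision threshold, the convention that label 1 means "no-show", and the eight toy records are all assumptions introduced here, not data or settings from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = no-show, by assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted no-show, was no-show
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted no-show, attended
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted attend, was no-show
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # predicted attend, attended

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but not for a million-row data set.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true outcomes and model-estimated no-show probabilities.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of figures are reported side by side.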