Abstract

Cholera has remained a public health threat throughout history, affecting vulnerable populations that live with unreliable water supplies and substandard sanitation. Various studies have observed that the occurrence of cholera is strongly linked to environmental factors such as climate change and geographical location. Climate change has been strongly linked to the seasonal occurrence and spread of cholera through the creation of weather patterns that favour the disease’s transmission, infection, and the growth of Vibrio cholerae, the bacterium that causes the disease. Over the past decades, there have been great achievements in developing epidemic models for the prediction of cholera. However, machine learning techniques have not been explicitly deployed in modelling cholera epidemics because of the challenges that come with the datasets, such as imbalanced data and missing information. This paper explores the use of machine learning algorithms such as decision tree, random forest, and logistic regression to evaluate the prevalence of cholera epidemics in West African countries while overcoming the data imbalance problem. Mean squared error, mean absolute error, F1 score, precision, and balanced accuracy were used to evaluate the performance of the three models. The results show that logistic regression achieved an accuracy of 0.47%, random forest 0.978%, and the best-performing model, the decision tree, 0.998%, with a mean squared error and a mean absolute error of 0.001% each, which suggests that the model can accurately predict cholera outbreaks in Africa. Overall, the results will improve the understanding of the significant role of machine learning techniques in healthcare data. The study recommends a review of healthcare systems to facilitate quality data collection and the deployment of machine learning techniques.
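The evaluation metrics named above can be illustrated with a minimal sketch. It assumes a binary labelling (1 = outbreak, 0 = no outbreak); the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study's dataset or code. The point is only to show why balanced accuracy is reported alongside plain accuracy when the classes are imbalanced.

```python
def binary_metrics(y_true, y_pred):
    """Compute common metrics for binary labels (1 = outbreak, 0 = no outbreak)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n = len(pairs)

    def safe_div(a, b):
        # Return 0.0 when a ratio is undefined (for example, no predicted positives).
        return a / b if b else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # sensitivity
    specificity = safe_div(tn, tn + fp)

    return {
        "accuracy": (tp + tn) / n,
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mae": sum(abs(t - p) for t, p in pairs) / n,
        "mse": sum((t - p) ** 2 for t, p in pairs) / n,
    }


# Hypothetical, imbalanced toy labels: 95 non-outbreak records, 5 outbreak records.
y_true = [0] * 95 + [1] * 5

# A degenerate model that always predicts "no outbreak".
majority_pred = [0] * 100
majority_scores = binary_metrics(y_true, majority_pred)

# A model that raises 2 false alarms and catches 4 of the 5 outbreaks.
better_pred = [1, 1] + [0] * 93 + [1, 1, 1, 1, 0]
better_scores = binary_metrics(y_true, better_pred)

for name, scores in [("majority", majority_scores), ("better", better_scores)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On these toy labels the always-negative model reaches an accuracy of 0.95 but a balanced accuracy of only 0.5, which is why accuracy alone can be misleading on imbalanced data. Note also that for 0/1 labels the mean absolute error and mean squared error coincide, since each error is either 0 or 1.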
