Abstract

This study examines whether applying a clustering stage before classification improves the accuracy of pregnancy risk prediction, and which classification algorithm benefits most from it. K-Means is used for clustering, and Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbor (KNN) are used for classification. The pregnancy risk dataset comes from the UCI Machine Learning Repository. The evaluation metrics are accuracy, precision, recall, and F1-score. K-Means combined with KNN gave the highest performance of the three models, with an accuracy of 79.53% and an average F1-score of 0.8. Adding K-Means increased accuracy by 0.4%, 1.57%, and 2.76% for KNN, SVM, and Naive Bayes respectively, which indicates that clustering improves classification performance. The resulting model can be used in real time through a website built with Flask, giving health practitioners a tool to plan treatment effectively and reduce pregnancy-related risk.
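The cluster-then-classify idea can be illustrated with a minimal sketch. The abstract does not say how the K-Means output is passed to the classifiers, so this example assumes one common approach: appending each sample's cluster id as an extra feature before running KNN. The function names, the evenly spaced centroid initialisation, and the toy data are all hypothetical and are not taken from the study or its dataset.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm.

    Assumption for illustration: centroids start at evenly spaced samples,
    which keeps the run deterministic; real code would use random restarts.
    """
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Mean of each group; an empty group keeps its previous centroid.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def add_cluster_feature(points, centroids):
    """Append the cluster id as an extra feature (assumed combination step)."""
    return [list(p) + [float(nearest(p, centroids))] for p in points]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training samples."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up toy data: two features per sample, two risk labels.
train_x = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_y = ["low", "low", "low", "high", "high", "high"]
test_x = [[1.1, 0.9], [5.1, 5.0]]
test_y = ["low", "high"]

# Fit K-Means on the training features only, then augment both splits.
centroids = kmeans(train_x, k=2)
train_aug = add_cluster_feature(train_x, centroids)
test_aug = add_cluster_feature(test_x, centroids)

predictions = [knn_predict(train_aug, train_y, x) for x in test_aug]
print(accuracy(test_y, predictions))
```

To compare runs with and without clustering, as the study does, the same `knn_predict` call would be repeated on `train_x` and `test_x` directly and the two accuracy values compared.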
