Empowering Pregnancy Risk Assessment: A Web-Based Classification Framework with K-Means Clustering Enhanced Models

Bernard Pratama Wongso, Melissa Indah Fianty, Monika Evelin Johan

doi:10.51519/journalisi.v5i4.568

Abstract

This study examines whether classification algorithms predict pregnancy risk more accurately with a K-Means clustering stage than without one, and which classification algorithm benefits most. It combines K-Means clustering with three classifiers: Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbor (KNN). The pregnancy risk dataset comes from the UCI Machine Learning Repository. The models are evaluated on accuracy, precision, recall, and F1-score. K-Means combined with KNN performed best of the three, with an accuracy of 79.53% and an average F1-score of 0.8. Adding K-Means increased accuracy by 0.4%, 1.57%, and 2.76% for KNN, SVM, and Naive Bayes respectively, indicating that clustering improves classification performance. The resulting model can be used in real time through a website built with Flask, giving health practitioners a tool to plan treatment effectively and reduce pregnancy risks.
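
The clustering-then-classification idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the K-Means cluster index is appended to each record as an extra feature before KNN is applied (the abstract does not say how the two stages are joined), it uses synthetic two-feature data instead of the UCI dataset, and every function name is hypothetical. Only the Python standard library is used.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group (keep it if empty).
        new = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids


def augment(rows, centroids):
    """Assumption: append the cluster index to each row as one more feature."""
    return [list(r) + [float(nearest(r, centroids))] for r in rows]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training rows closest to x."""
    ranked = sorted(zip(train_x, train_y), key=lambda t: math.dist(t[0], x))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Synthetic stand-in data: two well-separated groups with made-up labels.
rng = random.Random(42)


def make_group(center, label, n):
    return [([rng.gauss(c, 0.5) for c in center], label) for _ in range(n)]


data = make_group([0.0, 0.0], "low", 40) + make_group([5.0, 5.0], "high", 40)
rng.shuffle(data)
train, test = data[:60], data[60:]
train_x = [x for x, _ in train]
train_y = [y for _, y in train]
test_x = [x for x, _ in test]
test_y = [y for _, y in test]

# Stage 1: fit K-Means on the training features only.
centroids = kmeans(train_x, k=2)
# Stage 2: classify with KNN on the cluster-augmented features.
train_aug = augment(train_x, centroids)
test_aug = augment(test_x, centroids)
predictions = [knn_predict(train_aug, train_y, x) for x in test_aug]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"accuracy with cluster feature: {accuracy:.2f}")
```

To reproduce the kind of comparison the abstract reports, the same classifier would be run once on `train_x` and once on `train_aug`, and the two accuracies compared. Treating the cluster index as a numeric feature is a simplification; with more than two clusters, a one-hot encoding would likely be a better fit for a distance-based classifier such as KNN.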
