Polycystic Ovary Syndrome (PCOS) is a common endocrine disorder that affects women of reproductive age, leading to hormonal imbalances and ovarian dysfunction. Early detection and intervention are vital for effective management and prevention of complications. This study compares PCOS prediction using the XGBoost machine learning model against four traditional models: Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF). LR and SVM achieved accuracies of 95% and 96%, respectively, demonstrating strong predictive capabilities. In contrast, DT reached a lower accuracy of 82%, suggesting that a single tree is limited in handling the complexity of PCOS data. RF was competitive at 96% accuracy, underscoring the effectiveness of ensemble learning. XGBoost achieved the highest accuracy, 98%, under its chosen parameter configuration. The scale_pos_weight parameter adjusts the weight of the positive class in imbalanced datasets: it addresses underrepresentation by assigning more weight to the minority class, thereby sharpening the training focus on it. The gradient boosting framework builds the model incrementally, with each stage correcting the errors of the previous ones, which captures complex feature interactions and dependencies and improves the accuracy and stability of predictions on intricate PCOS datasets. This analysis highlights the importance of advanced machine learning models such as XGBoost for accurate and reliable PCOS prediction. This research advances PCOS prediction, demonstrates the potential of machine learning in healthcare, and clarifies the strengths and limitations of different algorithms on complex medical datasets.
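The abstract does not report the value of scale_pos_weight used in the study. As an illustration only, the minimal sketch below computes the commonly suggested starting point for this parameter, the ratio of negative to positive samples. The function name `suggested_scale_pos_weight` and the toy labels are hypothetical and are not taken from the study's code or data.

```python
from collections import Counter


def suggested_scale_pos_weight(labels):
    """Return n_negative / n_positive for binary labels.

    Assumption: 1 marks a PCOS-positive sample and 0 a negative one.
    This ratio is a common heuristic starting value, not necessarily
    the setting used in the study.
    """
    counts = Counter(labels)
    n_pos, n_neg = counts[1], counts[0]
    if n_pos == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return n_neg / n_pos


# Hypothetical toy labels (6 negatives, 3 positives), not the study's data.
toy_labels = [0] * 6 + [1] * 3
weight = suggested_scale_pos_weight(toy_labels)
print(weight)  # 2.0
```

A value computed this way would typically be passed as `scale_pos_weight=weight` when constructing `xgboost.XGBClassifier`, so that errors on the minority (positive) class contribute more to the training loss. In practice the value is usually treated as a starting point and tuned further by cross-validation.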