Abstract

This paper focuses on the data-driven diagnosis of polycystic ovary syndrome (PCOS) in women. Machine learning algorithms are applied to a dataset freely available in the Kaggle repository. The dataset contains 43 attributes for 541 women, of whom 177 are PCOS patients. First, a univariate feature selection algorithm is applied to find the features that best predict PCOS. The attributes are ranked, and the most important attribute is found to be the ratio of follicle-stimulating hormone (FSH) and luteinizing hormone (LH). Next, holdout and cross-validation methods are applied to the dataset to separate the training and testing data. A number of classifiers, such as gradient boosting, random forest, logistic regression, and a hybrid of random forest and logistic regression (RFLR), are applied to the dataset. The results indicate that the selected attributes are good enough to predict PCOS.
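The two preprocessing steps named in the abstract, univariate feature ranking and a cross-validation split, can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not name the scoring function, so a one-way ANOVA F statistic is used here as one common choice; the function names (`f_score`, `rank_features`, `k_fold_indices`), the feature names, and the toy values are hypothetical and are not taken from the PCOS dataset. The classifiers themselves, including RFLR, are not reproduced.

```python
from statistics import mean


def f_score(values, labels):
    """One-way ANOVA F statistic of a single feature against class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def rank_features(columns, labels):
    """Return feature names sorted from highest to lowest F score."""
    scores = {name: f_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Toy, made-up data for illustration only (not from the PCOS dataset).
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "feature_a": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],  # separates the classes
    "feature_b": [5.0, 4.0, 6.0, 5.0, 4.0, 6.0],  # no class signal
}
ranking = rank_features(columns, labels)
folds = list(k_fold_indices(len(labels), 3))
```

In practice the rows would be shuffled (or stratified by class) before splitting, and a holdout split is the special case of a single train/test partition.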
