Abstract

Polycystic ovary syndrome (PCOS) is the most frequent endocrinopathy in women of reproductive age. Machine learning (ML) is the area of artificial intelligence focused on predictive computing algorithms. We aimed to identify the clinical and laboratory variables most relevant to PCOS diagnosis and to stratify patients into phenotypic groups (clusters) using ML algorithms. Variables from a database comparing 72 patients with PCOS and 73 healthy women were included. The BorutaShap method, followed by the Random Forest algorithm, was applied to the prediction and clustering of PCOS. Among the 58 variables investigated, the algorithm selected the following, in decreasing order of importance, as associated with PCOS diagnosis: lipid accumulation product (LAP); abdominal circumference; thrombin-activatable fibrinolysis inhibitor (TAFI) levels; body mass index (BMI); C-reactive protein (CRP), high-density lipoprotein cholesterol (HDL-c), follicle-stimulating hormone (FSH) and insulin levels; HOMA-IR value; age; prolactin, 17-OH progesterone and triglyceride levels; and family history of diabetes mellitus in a first-degree relative. The combined use of these variables by the algorithm yielded an accuracy of 86% and an area under the ROC curve of 97%. Next, PCOS patients were grouped into two clusters: in the first, patients had higher BMI, abdominal circumference, LAP and HOMA-IR index, as well as higher CRP and insulin levels, than those in the second. The developed algorithm could be applied to select the most important clinical and biochemical variables related to PCOS and to classify patients into phenotypically different clusters. These results could guide more personalized and effective approaches to the treatment of PCOS.
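
Several of the selected variables (LAP, HOMA-IR, BMI) are derived indices rather than direct measurements, and the abstract does not state how they were calculated. The following is a minimal sketch that assumes the commonly used definitions: LAP as (waist circumference in cm − 58) × triglycerides in mmol/L for women, HOMA-IR as fasting glucose in mg/dL × fasting insulin in µU/mL ÷ 405, and BMI as weight in kg ÷ height in m squared. The function names, units and example values are illustrative assumptions, not the study's code or patient data.

```python
def lipid_accumulation_product(waist_cm, triglycerides_mmol_l, sex="female"):
    """LAP under the commonly used definition (assumed, not taken from the abstract).

    Women: (waist - 58) * TG; men: (waist - 65) * TG,
    with waist in cm and triglycerides in mmol/L.
    """
    if sex == "female":
        offset = 58.0
    elif sex == "male":
        offset = 65.0
    else:
        raise ValueError("sex must be 'female' or 'male'")
    return (waist_cm - offset) * triglycerides_mmol_l


def homa_ir(fasting_glucose_mg_dl, fasting_insulin_uU_ml):
    """HOMA-IR with glucose in mg/dL and insulin in microU/mL (assumed units)."""
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2."""
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(lipid_accumulation_product(90.0, 1.5))
    print(homa_ir(90.0, 9.0))
    print(round(body_mass_index(70.0, 1.75), 1))
```

Because LAP combines waist circumference with triglycerides and HOMA-IR combines glucose with insulin, these indices are correlated with several of the other selected variables, which is worth keeping in mind when reading the importance ranking.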
