Abstract

Background: Endometriosis is a debilitating gynecological disorder characterized by chronic pain, infertility, and the growth of endometrial tissue outside the uterus. Accurate, early detection is crucial for effective management and treatment. Methods: We developed a gene rank matrix-based model to integrate endometriosis cohorts across multiple platforms. After removing batch effects, we identified 83 genes associated with endometriosis and refined a diagnostic model that uses 11 of them. The model was trained on data from two platforms and validated on data from two others, using support vector machine (SVM), random forest, logistic regression, and gradient boosting algorithms. Results: Integration via the gene rank matrix effectively mitigated batch effects. A gradient boosting classifier using the 11-gene subset achieved an area under the curve (AUC) of 0.77, an accuracy of 0.72, and an F1 score of 0.72 on the training dataset. On the validation dataset, performance was comparable: an AUC of 0.769, an accuracy of 0.719, and an F1 score of 0.732. The 11 genes were associated with immunosuppression. Conclusion: Gene rank matrix integration effectively consolidates endometriosis data across diverse platforms. The 11-gene diagnostic model outperformed the alternative models and shows promise as an aid to the clinical diagnosis of endometriosis. Further validation is needed to establish the functional significance of these 11 genes. Our study underscores the potential of combining data integration with machine learning to advance the diagnosis of complex diseases such as endometriosis.
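
To illustrate why a rank-based representation can reduce platform-specific scale differences, the sketch below replaces each sample's expression values with within-sample ranks before pooling cohorts. The abstract does not specify how the gene rank matrix was built, so this is only one plausible reading: the function names (`rank_transform`, `build_rank_matrix`), the scaling of ranks to (0, 1], the average-rank handling of ties, the restriction to shared genes, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List


def rank_transform(sample: Dict[str, float]) -> Dict[str, float]:
    """Replace each gene's expression value with its within-sample rank.

    Ranks are scaled to (0, 1]; tied values share their average rank.
    (Illustrative choice, not taken from the paper.)
    """
    genes = sorted(sample, key=lambda g: sample[g])
    n = len(genes)
    ranks: Dict[str, float] = {}
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied expression values.
        while j + 1 < n and sample[genes[j + 1]] == sample[genes[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[genes[k]] = avg_rank / n
        i = j + 1
    return ranks


def build_rank_matrix(
    cohorts: List[Dict[str, Dict[str, float]]]
) -> Dict[str, Dict[str, float]]:
    """Pool cohorts into one sample-by-gene rank matrix over shared genes."""
    shared = set.intersection(
        *(set(expr) for cohort in cohorts for expr in cohort.values())
    )
    matrix: Dict[str, Dict[str, float]] = {}
    for cohort in cohorts:
        for sample_id, expr in cohort.items():
            matrix[sample_id] = rank_transform({g: expr[g] for g in shared})
    return matrix


# Hypothetical toy data: the same gene ordering measured on two different scales.
microarray = {"s1": {"A": 5.1, "B": 7.3, "C": 6.0, "X": 1.0}}
rnaseq = {"s2": {"A": 120.0, "B": 900.0, "C": 450.0, "Y": 3.0}}
rank_matrix = build_rank_matrix([microarray, rnaseq])
```

In this toy case the two samples end up with identical rank profiles even though their raw values differ by orders of magnitude, which is the property that makes ranks attractive for cross-platform pooling. Real cohorts would still need to be checked for residual batch effects after ranking.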
