Abstract

Background: Parkinson’s Disease (PD) is a neurodegenerative condition that takes 15 to 20 years to exhibit symptoms. During the initial stages of the disease, however, micro-level changes occur in the body that typically go undetected. Understanding the pathogenesis of PD requires understanding these micro-level changes. Objectives: The objective of the present study is to develop a hybrid ensembled machine learning pipeline model for identifying inter-stage PD biomarkers, in order to understand disease progression as well as etiology. Methods: The work was carried out on dataset GSE202667 from the Gene Expression Omnibus, which contains time-resolved RNA signatures of CD4+ T cells at various stages. Differentiating genes were identified in the different inter-stage groups. Two types of unsupervised learning methods were applied: distance-based (K-means, Agglomerative clustering, Density-Based Spatial Clustering of Applications with Noise) and probability-based (Hidden Markov Model, Gaussian Mixture Model, Latent Dirichlet Allocation). The best-performing algorithms were selected and applied to optimize the clusters. Enrichment analysis was conducted on the top 10 PD biomarkers in each category. Findings: The top 10 PD biomarkers in each category were identified. Gene set enrichment analysis showed that they are enriched in three KEGG pathways and depleted in two GO molecular functions. The biomarkers are depleted in 215 Reactome pathways and enriched in 18 Reactome pathways. Twelve GO cellular components were enriched in the gene set, whereas 111 were depleted. A total of 25 GO biological processes were enriched, while 339 were depleted. Novelty: The proposed hybrid ensembled machine learning pipeline model works as a tool for identifying PD biomarkers from omics data. The model helps identify implicit patterns in omics data in order to reveal the mechanism of PD progression through biomarker discovery.

Keywords: Gaussian Mixture Model clustering, Early Detection, k-means clustering, Machine learning, Parkinson’s Disease Interstage Biomarkers
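The cluster-optimization step described in the Methods can be illustrated with a minimal sketch. The abstract does not state which criterion was used to select algorithms or cluster counts, so the silhouette score used here is an assumption, as are the synthetic two-dimensional points standing in for expression profiles and the helper names `init_centers`, `kmeans`, and `silhouette`. Only the K-means branch is shown; it is not the authors' pipeline.

```python
import math
import random


def init_centers(points, k):
    """Deterministic farthest-first initialization (an illustrative choice)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = init_centers(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the old center if a cluster empties
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic stand-in data: three well-separated groups of 30 points each.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
samples = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
           for cx, cy in blob_centers for _ in range(30)]

# Score each candidate cluster count and keep the best one.
scores = {k: silhouette(samples, kmeans(samples, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

On this synthetic data the three-cluster solution scores highest. The same scoring loop could in principle be run over the labels produced by the other algorithms named in the Methods, so that one criterion ranks both the algorithm and the cluster count.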
