BackgroundClinically, behavior, cognitive, and mental functions are affected during the neurodegenerative disease progression. To date, the molecular pathogenesis of these complex disease is still unclear. With the rapid development of sequencing technologies, it is possible to delicately decode the molecular mechanisms corresponding to different clinical phenotypes at the genome-wide transcriptomic level using computational methods. Our previous studies have shown that it is difficult to distinguish disease genes from non-disease genes. Therefore, to precisely explore the molecular pathogenesis under complex clinical phenotypes, it is better to identify biomarkers corresponding to different disease stages or clinical phenotypes. So, in this study, we designed a label propagation-based semi-supervised feature selection approach (LPFS) to prioritize disease-associated genes corresponding to different disease stages or clinical phenotypes.MethodsIn this study, we pioneering put label propagation clustering and feature selection into one framework and proposed label propagation-based semi-supervised feature selection approach. LPFS prioritizes disease genes related to different disease stages or phenotypes through the alternative iteration of label propagation clustering based on sample network and feature selection with gene expression profiles. Then the GO and KEGG pathway enrichment analysis were carried as well as the gene functional analysis to explore molecular mechanisms of specific disease phenotypes, thus to decode the changes in individual behavioral and mental characteristics during neurodegenerative disease progression.ResultsLarge amounts of experiments were conducted to verify the performance of LPFS with Huntington’s gene expression data. Experimental results shown that LPFS performs better in comparison with the-state-of-art methods. GO and KEGG enrichment analysis of key gene sets shown that TGF-beta signaling pathway, cytokine-cytokine receptor interaction, immune response, and inflammatory response were gradually affected during the Huntington’s disease progression. In addition, we found that the expression of SLC4A11, ZFP474, AMBP, TOP2A, PBK, CCDC33, APSL, DLGAP5, and Al662270 changed seriously by the development of the disease.ConclusionsIn this study, we designed a label propagation-based semi-supervised feature selection model to precisely selected key genes of different disease phenotypes. We conducted experiments using the model with Huntington’s disease mice gene expression data to decode the mechanisms of it. We found many cell types, including astrocyte, microglia, and GABAergic neuron, could be involved in the pathological process.
Read full abstract