Abstract
Background
A large number of genetic variations have been associated with Alzheimer’s disease (AD) and related traits, but the downstream biology through which they affect the development of AD remains unknown. Leveraging genomic, transcriptomic and proteomic data, together with various biological networks, we propose a modularity‐constrained logistic regression model (M‐logistic) to identify a set of functionally connected single‐nucleotide polymorphisms (SNPs), genes and proteins related to AD.

Method
The GWAS genotype data, RNA‐Seq gene expression data and protein expression data were obtained from the ROS/MAP Project. Detailed data processing steps can be found in AMP‐AD (https://www.synapse.org). In total, 179 subjects with a full set of data were included (Table 1). Using peptides as seeds, we prefiltered the data and included 186 peptides, 743 unique genes and 822 SNPs located upstream of their connected genes (Fig. 1). Functional interactions from the REACTOME database [1] and the SNP‐gene mapping relationship were used to construct the prior network. These functionally connected ‐omics features were used to distinguish AD patients from the non‐symptomatic group, which included cognitively normal subjects and subjects with mild cognitive impairment.

Result
Across 10‐fold cross‐validation, M‐logistic largely outperformed the other state‐of‐the‐art models (Table 2). For feature selection, the proposed model identified 305 features predictive of disease status, each of which was selected in more than 6 of the 10 folds. When mapped back to the prior network, these features formed three large sub‐networks with more than 10 nodes each (Fig. 2). For the 20 genes in the largest sub‐network, the brain‐wide expression profiles in the Allen Human Brain Atlas correlated most strongly with the activation map of the brain function term “vision” in Neurosynth [2] (Fig. 3).

Conclusion
M‐logistic identified a set of SNPs, genes and proteins that are not only predictive of AD but also densely connected in the prior network. The trans‐omic paths in the selected sub‐networks indicate a potential pathway underlying the development of AD, from SNPs to gene expression, protein expression and ultimately brain functional and structural changes. Future efforts are warranted to enable integrative analysis of multi‐omics data from decoupled subjects.

[1] Fabregat et al., Nucleic Acids Research, 2018.
[2] Yarkoni et al., Nature Methods, 2011.
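
The post-hoc feature selection described in the Result (keep features selected in more than 6 of 10 folds, then look for sub-networks with more than 10 nodes in the prior network) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy feature identifiers and the toy edges are hypothetical, and the toy example uses smaller thresholds so that it fits in a few lines.

```python
from collections import Counter, defaultdict


def stable_features(selected_per_fold, min_folds):
    """Return features selected in MORE than `min_folds` folds."""
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    return {f for f, c in counts.items() if c > min_folds}


def connected_subnetworks(edges, kept, min_size):
    """Connected components of the prior network restricted to `kept`,
    keeping only components with MORE than `min_size` nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        if u in kept and v in kept:
            adj[u].add(v)
            adj[v].add(u)

    seen, components = set(), []
    for start in kept:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited feature.
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if len(comp) > min_size:
            components.append(comp)
    return sorted(components, key=len, reverse=True)


# Toy data (hypothetical): features selected in each of 3 folds.
folds = [
    ["rs1", "GENE_A", "PROT_A", "GENE_B"],
    ["rs1", "GENE_A", "PROT_A"],
    ["rs1", "GENE_A", "PROT_A", "GENE_B"],
]
# Toy prior network (hypothetical): SNP-gene and gene-protein links.
edges = [("rs1", "GENE_A"), ("GENE_A", "PROT_A"), ("GENE_A", "GENE_B")]

kept = stable_features(folds, min_folds=2)               # abstract: more than 6 of 10
subnets = connected_subnetworks(edges, kept, min_size=2)  # abstract: more than 10 nodes
```

With the thresholds from the abstract, the calls would be `stable_features(folds, min_folds=6)` on the ten cross-validation folds and `connected_subnetworks(edges, kept, min_size=10)` on the REACTOME-plus-SNP-mapping prior network.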