Abstract

Estimates of the heritability of Alzheimer's Disease (AD) range from 49% to 79%, but the single nucleotide variants (SNVs) identified to date account for <50% of that heritability. Other types of genetic variants, such as copy number variations (CNVs), have been proposed to explain part of the missing heritability. To accelerate AD genetic discovery, the Alzheimer's Disease Sequencing Project (ADSP) sequenced cases and cognitively normal elderly controls from multiple ethnic groups. This large-scale whole-genome sequencing (WGS) collection allows us to detect CNVs across the full spectrum (small or large, common or rare, coding or non-coding) and to analyze their association with AD. We developed a bioinformatics pipeline consisting of four steps. First, we applied three calling algorithms, CNVnator, JAX-CNV, and Smoove, to each sample. Second, we merged the results of each tool with Svimmer. Third, we employed GraphTyper for genotyping; GraphTyper is a graph-pangenome-based method that may mitigate bias from the human reference genome. Finally, we resolved cases of multiple CNVs overlapping in a region using bedtools. Once the CNV callset was obtained, we conducted an association study to identify CNV regions that contribute to AD. We considered conventional association methods as well as a newly developed CNV-curve-centric strategy, which models CNVs as piecewise constant curves to assess the CNV effect on AD. For each sample, the three algorithms identified an average of 7,959 CNVs. After merging CNVs across all samples, we obtained 56,316 deletions and 16,390 duplications for GraphTyper joint genotyping. Preliminary results of the PLINK permutation-based test on chromosome 19 indicated significant CNVs (p-value < 0.05 after adjusting for multiple tests) in the genes UNC13A, IGSF23, KMT2B, LIN37, PSENEN, IGFLR1, and U2AF1L4. UNC13A, KMT2B, and PSENEN have previously been associated with frontotemporal dementia, dystonia, and AD, respectively.
Further validation of these CNVs is necessary. We developed a scalable CNV detection pipeline and applied it to 3,928 ADSP WGS samples. The preliminary results of the association study are consistent with some known AD, dementia, and dystonia genes. To the best of our knowledge, this is the first large-scale CNV investigation of AD using WGS data.
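
The abstract does not spell out the CNV-curve-centric strategy beyond modeling CNVs as piecewise constant curves, so the following is only a minimal sketch of that representation and not the authors' implementation. It assumes a hypothetical input layout of `(start, end, copy_number)` tuples with half-open coordinates, non-overlapping CNVs within one sample (as would be the case after overlaps are resolved), and a baseline copy number of 2 for autosomal regions. The function names `cnv_curve` and `copy_number_at` are illustrative.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical record layout: (start, end, copy_number), half-open [start, end).
CNV = Tuple[int, int, int]


def cnv_curve(
    cnvs: List[CNV], region_start: int, region_end: int, baseline: int = 2
) -> Tuple[List[int], List[int]]:
    """Build a piecewise constant copy-number curve for one sample.

    Returns (breakpoints, values), where values[i] is the copy number on
    [breakpoints[i], breakpoints[i + 1]). Assumes the CNVs do not overlap
    each other; positions not covered by any CNV take the baseline value.
    """
    points = {region_start, region_end}
    clipped = []
    for start, end, copy_number in cnvs:
        # Clip each CNV to the region of interest and skip those outside it.
        s, e = max(start, region_start), min(end, region_end)
        if s < e:
            points.update((s, e))
            clipped.append((s, e, copy_number))

    breakpoints = sorted(points)
    values = [baseline] * (len(breakpoints) - 1)
    for s, e, copy_number in clipped:
        for i in range(len(values)):
            # A segment lies inside a CNV when both of its ends do.
            if s <= breakpoints[i] and breakpoints[i + 1] <= e:
                values[i] = copy_number
    return breakpoints, values


def copy_number_at(breakpoints: List[int], values: List[int], pos: int) -> int:
    """Evaluate the curve at a single genomic position."""
    if not breakpoints[0] <= pos < breakpoints[-1]:
        raise ValueError("position lies outside the region")
    return values[bisect_right(breakpoints, pos) - 1]


# One heterozygous deletion and one duplication in a toy 500 bp region.
breakpoints, values = cnv_curve([(100, 200, 1), (300, 400, 3)], 0, 500)
```

Representing each sample as a step function over a shared coordinate axis lets CNVs with different breakpoints be compared position by position, which is one plausible reason to prefer curves over discrete CNV calls in an association model.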
