Abstract

Given that improved imputation software and high-coverage whole genome sequence (WGS)-based haplotype reference panels now enable inexpensive approximation of WGS genotype data, we hypothesised that WGS-based imputation and analysis of existing ExomeChip-based genome-wide association (GWA) data would identify novel intronic and intergenic single nucleotide polymorphism (SNP) effects associated with complex disease risk. In this study, we reanalysed a Parkinson’s disease (PD) dataset comprising 5540 cases and 5862 controls genotyped using the ExomeChip-based NeuroX array. After genotype imputation and extensive quality control, GWA analysis was performed using PLINK and a recently developed machine learning approach (GenEpi) to identify novel, conditional and joint genetic effects associated with PD. In addition to improved validation of previously reported loci, we identified five novel genome-wide significant loci associated with PD: three (rs137887044, rs78837976 and rs117672332) with 0.01 < MAF < 0.05, and two (rs187989831 and rs12100172) with MAF < 0.01. Conditional analysis within genome-wide significant loci revealed four loci (p < 1 × 10⁻⁵) with multiple independent risk variants, while GenEpi analysis identified SNP–SNP interactions in seven genes. In addition to identifying novel risk loci for PD, these results demonstrate that WGS-based imputation and analysis of existing exome genotype data can identify novel intronic and intergenic SNP effects associated with complex disease risk.

Highlights

  • Over the past decade, genome-wide association studies (GWAS) have successfully identified many individual common genetic variants (i.e., single nucleotide polymorphisms (SNPs)) associated with the risk of a wide range of complex diseases

  • To account for the missing heritability of complex diseases, it is important to explore the role of low-frequency SNPs (minor allele frequency (MAF) less than 0.05) at novel or established risk loci, as well as potential interactions between SNPs whose joint contribution to disease risk may be strong compared to their main effects

  • Given that improved imputation software and whole genome sequence (WGS)-based haplotype reference panels enable inexpensive approximation of WGS genotype data, we hypothesised that WGS-based imputation and analysis of existing ExomeChip-based GWAS data will identify novel intronic and intergenic SNP effects associated with complex disease risk
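
The MAF thresholds mentioned above (0.05 and 0.01) can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function names, the 0/1/2 genotype coding, the class labels ("rare", "low-frequency", "common") and the handling of values exactly on a threshold are assumptions made for illustration only.

```python
from typing import Sequence


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """Return the MAF for one biallelic SNP.

    Assumed coding (hypothetical): each entry is the number of copies of the
    alternate allele carried by one individual (0, 1 or 2); any other value
    is treated as a missing call and ignored.
    """
    called = [g for g in genotypes if g in (0, 1, 2)]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def frequency_class(maf: float) -> str:
    """Bin a MAF using the 0.01 and 0.05 cut-offs quoted in the text.

    The labels and the boundary handling are illustrative assumptions.
    """
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"


if __name__ == "__main__":
    # 99 homozygous-reference individuals and 1 heterozygote.
    snp = [0] * 99 + [1]
    maf = minor_allele_frequency(snp)
    print(maf, frequency_class(maf))
```

In practice, allele frequencies for array or imputed data would normally be computed by dedicated tools such as PLINK rather than by hand; the sketch only makes the definition concrete.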

Summary

Introduction

Genome-wide association studies (GWAS) have successfully identified many individual common genetic variants (i.e., single nucleotide polymorphisms (SNPs)) associated with the risk of a wide range of complex diseases. Due to insufficient statistical power, the genetic effects identified by typical GWAS tend to explain only a small fraction of the overall genetic variation underlying complex diseases [1]. Because most GWAS focus on generating genetic data in new samples and use standard statistical tools to detect common SNPs with marginal effects, they do not identify heterogeneous effects or epistatic interactions among multiple SNPs. Next-generation sequencing (NGS) technology allowed the development and use of cost-effective genotyping arrays to efficiently genotype and assess common genome-wide genetic variation in large samples, leading to the discovery of thousands of risk SNPs for many complex diseases. New methodologies have recently been developed using machine learning approaches to efficiently discover joint genetic effects of variants contributing towards complex disease risk by tackling the challenges of traditional
