Abstract

Advances in next-generation sequencing technology are enabling researchers to capture a comprehensive picture of genomic variation across large numbers of individuals with unprecedented levels of efficiency. The main analytic challenge in disease mapping is how to mine the data for rare causal variants among a sea of neutral variation. To achieve this goal, investigators have proposed a number of methods that exploit biological knowledge. In this paper, I propose applying a Bayesian stochastic search variable selection algorithm in this context. My multivariate method is inspired by the combined multivariate and collapsing method. In this proposed method, however, I allow an arbitrary number of different sources of biological knowledge to inform the model as prior distributions in a two-level hierarchical model. This allows rare variants with similar prior distributions to share evidence of association. Using the 1000 Genomes Project single-nucleotide polymorphism data provided by Genetic Analysis Workshop 17, I show that through biologically informative prior distributions, some power can be gained over noninformative prior distributions.
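To make the idea of a second-level prior concrete, the following is a minimal sketch and not the model fitted in the paper. It assumes each variant carries a vector of hypothetical biological annotations and maps that vector to a prior inclusion probability through a logistic link. The function name, the annotation columns, the weights, and the intercept are all illustrative assumptions.

```python
import math


def prior_inclusion_probabilities(annotations, weights, intercept=-2.0):
    """Map each variant's annotation vector to a prior inclusion probability.

    Assumption: a logistic link on a linear combination of annotations.
    This is chosen for illustration only; the source does not specify the link.
    """
    probs = []
    for z in annotations:
        linear = intercept + sum(w * x for w, x in zip(weights, z))
        probs.append(1.0 / (1.0 + math.exp(-linear)))
    return probs


# Hypothetical annotations per variant: [nonsynonymous, in_candidate_pathway]
annotations = [
    [1, 1],  # variant A
    [1, 1],  # variant B, same annotations as A
    [0, 0],  # variant C, no supporting annotation
]

# Illustrative second-level coefficients, not estimates from the paper
weights = [1.0, 0.5]

probs = prior_inclusion_probabilities(annotations, weights)
for name, p in zip("ABC", probs):
    print(f"variant {name}: prior inclusion probability = {p:.3f}")
```

In this sketch, variants A and B receive the same prior inclusion probability because their annotations match, which is one simple way variants with similar prior information can share evidence. In a full hierarchical model the weights would themselves be estimated from the data rather than fixed.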


Summary

Background

Genome-wide association studies (GWAS) have been a powerful method for revealing common variants that confer a modest increase in disease risk in carriers. The single-nucleotide polymorphisms (SNPs) that show the strongest evidence for association in GWAS do not perfectly tag the putative causal variant(s) nearby because of ancestral recombination events; resequencing in these regions is necessary to resolve the precise location of the causal variant(s). Dickson et al. [1] postulated one possible explanation for why many fine-mapping efforts have failed to map a single causal SNP in the region tagged by the original genome-wide association signal: multiple rare variants (MRVs) residing on multiple haplotypes in that region generate a “synthetic” association when these haplotypes share a common allele that is observed more often in case subjects than in control subjects. The exon resequencing data set provided by the organizers of Genetic Analysis Workshop 17 (GAW17) offers an ideal opportunity for evaluating the performance of this new approach.

