Abstract
In this analysis, we investigate the contributions that linkage-based methods, such as identity-by-descent mapping, can make to association mapping to identify rare variants in next-generation sequencing data. First, we identify regions in which cases share more segments identical-by-descent around a putative causal variant than do controls. Second, we use a two-stage mixed-effects model approach to summarize the single-nucleotide polymorphism data within each region and include them as covariates in the model for the phenotype. We assess the impact of linkage disequilibrium on determining identical-by-descent states between individuals by using markers with and without linkage disequilibrium in the first part, and the impact of imputation on testing for association by using imputed genome-wide association study markers or raw sequence markers in the second part. We apply the method to next-generation sequencing longitudinal family data from Genetic Association Workshop 18 and identify a significant region at chromosome 3: 40249244-41025167 (p-value = 2.3 × 10⁻³).
Highlights
In genetic association studies, joint analysis of multiple single-nucleotide polymorphisms (SNPs) can be more powerful than separate SNP analysis because single markers typically either have small effect sizes or minor allele frequencies that are too small to reliably fit models [1].
There may be a middle ground in which multiple rare variants of moderate effect size play a key role in the etiology of some diseases.
To assess the impact of linkage disequilibrium (LD) on our analysis, we present results from estimating IBD probabilities using markers with and without LD.
Summary
Joint analysis of multiple single-nucleotide polymorphisms (SNPs) can be more powerful than separate SNP analysis because single markers typically either have small effect sizes (common variants) or minor allele frequencies that are too small to reliably fit models (rare variants) [1]. There may be a middle ground in which multiple rare variants of moderate effect size play a key role in the etiology of some diseases. Such situations might be ideal for identity-by-descent (IBD) mapping [2]. In the first part of our analysis, we use the methods of Browning and Thompson [2] to identify regions in which cases share more IBD segments around a putative causal variant than do controls. After selecting these regions, we use a two-stage mixed-effects model approach, which was recently proposed by Tsonaka et al. [4], to summarize the SNP data within each region and include them as covariates in the model for the phenotype. To increase our power to identify rare variants, we include the number of rare variants per region as a covariate in the model.
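The rare-variant covariate mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the genotype coding (minor-allele counts 0, 1, or 2 per SNP), the allele-frequency cutoff, and the choice to count rare variants carried by each individual are all assumptions made for illustration, and the naive frequency estimate ignores relatedness among family members.

```python
def rare_variant_counts(genotypes, maf_threshold=0.05):
    """Count, per individual, the rare variants carried within one region.

    genotypes: list of rows, one per individual; each row holds the
    minor-allele count (0, 1, or 2) at every SNP in the region.
    maf_threshold: hypothetical cutoff below which a SNP is called rare.
    """
    n_ind = len(genotypes)
    if n_ind == 0:
        return []
    n_snp = len(genotypes[0])

    # Naive sample allele frequency at each SNP (assumes unrelated samples).
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_ind)
        for j in range(n_snp)
    ]

    # Keep polymorphic SNPs whose frequency falls below the cutoff.
    rare = [j for j, f in enumerate(freqs) if 0 < f < maf_threshold]

    # For each individual, count the rare SNPs at which a minor allele is carried.
    return [sum(1 for j in rare if row[j] > 0) for row in genotypes]


# Toy region: 4 individuals x 4 SNPs (illustrative values, not real data).
region = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 2, 0, 0],
    [0, 0, 0, 0],
]
counts = rare_variant_counts(region, maf_threshold=0.3)
print(counts)
```

The resulting per-individual counts could then enter the phenotype model as one additional covariate per region, alongside the region-level SNP summaries.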