Abstract

Because the GAW20 samples provide limited information when only case-control or trio data are considered, we propose eLBL, an extension of the Logistic Bayesian LASSO (least absolute shrinkage and selection operator) methodology, so that both types of data can be analyzed jointly to increase statistical power, especially for detecting association between rare haplotypes and complex diseases. The methodology is further extended to account for familial correlation among the case-control individuals and the trios. We adopted a 2-step analysis strategy. In Step 1, a genome-wide single-SNP (single-nucleotide polymorphism) search using the Monte Carlo pedigree disequilibrium test (MCPDT) identified regions of interest for the Adult Treatment Panel (ATP) binary trait. In Step 2, eLBL was applied to haplotype blocks covering the SNPs flagged in Step 1. Several significantly associated haplotypes were identified; most are in blocks contained in protein-coding genes that appear to be relevant to metabolic syndrome. The results are further substantiated by a Type I error study and by an additional analysis using the triglyceride measurements directly as a quantitative trait.
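The 2-step strategy described in the abstract can be illustrated with a minimal sketch. It shows only the hand-off between the steps: SNPs flagged by a single-SNP scan are used to pick the haplotype blocks passed on to the haplotype-level analysis. Everything here is an assumption made for illustration: the `Snp` and `Block` types, the function names, the significance threshold, and the toy data are hypothetical, and neither MCPDT nor eLBL is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    rsid: str        # hypothetical identifier
    chrom: str
    pos: int         # base-pair position
    p_value: float   # assumed output of the Step 1 single-SNP test


@dataclass(frozen=True)
class Block:
    name: str
    chrom: str
    start: int       # inclusive
    end: int         # inclusive


def flag_snps(snps, threshold):
    """Step 1 (sketch): keep SNPs whose p-value is at or below the threshold."""
    return [s for s in snps if s.p_value <= threshold]


def blocks_covering(flagged, blocks):
    """Step 2 input (sketch): keep blocks that contain at least one flagged SNP."""
    return [
        b for b in blocks
        if any(s.chrom == b.chrom and b.start <= s.pos <= b.end for s in flagged)
    ]


# Toy data, invented purely for illustration.
snps = [
    Snp("snpA", "1", 100, 1e-6),
    Snp("snpB", "1", 5000, 0.2),
    Snp("snpC", "2", 250, 1e-5),
]
blocks = [
    Block("block1", "1", 50, 150),
    Block("block2", "1", 4000, 6000),
    Block("block3", "2", 200, 300),
]

flagged = flag_snps(snps, threshold=1e-4)  # threshold is an arbitrary choice
candidate_blocks = blocks_covering(flagged, blocks)
print([b.name for b in candidate_blocks])
```

In this sketch the candidate blocks are the only regions handed to the second step, which is what keeps the more expensive haplotype analysis restricted to the regions flagged in Step 1.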

Highlights

  • As next-generation sequencing (NGS) technology becomes more accurate and affordable, many recent studies have focused on assessing associations between common complex diseases and single-nucleotide variants (SNVs), paying particular attention to those that are rare.

  • Haplotype h11110 of block 1, which contains the minor alleles of the single-nucleotide polymorphisms (SNPs) rs10915052 and rs1406862, shows fairly strong evidence of association, with a Bayes factor (BF) of 15.

  • Because this haplotype carries both minor alleles and the block is not located within the coding region of a gene, the 2 SNPs may well interact in cis and play a regulatory role in metabolic syndrome.

Summary

Introduction

As next-generation sequencing (NGS) technology becomes more accurate and affordable, many recent studies have focused on assessing associations between common complex diseases and single-nucleotide variants (SNVs), paying particular attention to those that are rare. Various methods have been proposed, but most can only identify candidate genes or regions. To narrow the list of potential causal variants, it would be helpful to investigate haplotype blocks formed by single-nucleotide polymorphisms (SNPs) in regions or genes where associations are suggested but are not necessarily genome-wide significant. Apart from being able to identify biologically relevant variants, haplotype-based methods can be more powerful than SNV-based methods because they draw on multilocus genotype information. The GAW20 Real Data Package provides a good opportunity to apply the Logistic Bayesian LASSO (LBL) to identify haplotypes that are associated with metabolic syndrome.
