Abstract

The ADAMTS7 locus has been associated with coronary artery disease (CAD) in humans in multiple recent genome-wide association studies. Based on these associations, we hypothesized that functional rare variants in the gene may be causal risk alleles for CAD. To identify rare variants in the human ADAMTS7 gene, we used data from the NHLBI Exome Sequencing Project (ESP) of 6503 participants. According to the Exome Variant Server (data release ESP6500SI-V2), the ESP identified 404 variants in the entire genic sequence of ADAMTS7, of which 169 were missense variants and 2 were nonsense variants. We acquired BAM files for the ADAMTS7 region in 5438 ESP samples and performed in-house realignment, filtering and variant calling for the ADAMTS7 gene using the following criteria: sequences were required to show ≥90% alignment to the reference sequence and ≥2% better alignment to the actual ADAMTS7 gene than to pseudogenes; otherwise they were excluded from analysis. In these custom analyses we identified 103 missense and 2 nonsense variants. In addition, we performed Sanger sequencing of 200 PennCath patients (100 MI cases, 100 controls) using a primer design that accounted for pseudogenes. For variant calling we used only data with 2X coverage in >50% of samples tested (19 of 28 amplicons). The Sanger sequence data revealed 27 missense and zero nonsense variants. Comparison of the custom analysis of ESP data with the independent Sanger sequencing provided high-confidence calls for 12 coding variants in the ADAMTS7 gene, including the S214P variant, a lead SNP in the GWAS association. In summary, several variants in the ADAMTS7 gene included in the Exome Variant Server appear to arise from pseudogenes, while a number of high-confidence ADAMTS7 variants are not included on the exome chip. By combining multiple sequencing technologies from multiple studies, we identified 12 variants located in the actual coding sequence of ADAMTS7 for future functional studies.
More broadly, pseudogenes present a unique challenge to identifying coding variants in the age of exome and whole genome sequencing and require a customized strategy for sequence alignment and variant calling. ADAMTS7 provides a cautionary example, and we present one approach to overcome these challenges.
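
The two filtering thresholds stated above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the use of percent identity as the alignment measure, and the reading of "≥2% better" as a difference of two percentage points are all assumptions made for illustration only.

```python
from typing import Sequence


def keep_read(target_identity: float,
              pseudogene_identities: Sequence[float],
              min_identity: float = 90.0,
              min_margin: float = 2.0) -> bool:
    """Hypothetical read filter (names and identity metric are assumptions).

    A read is kept only if it aligns to the ADAMTS7 reference with at least
    `min_identity` percent identity AND aligns to ADAMTS7 at least
    `min_margin` percentage points better than to its best pseudogene hit.
    """
    # Best competing alignment; 0.0 if the read hits no pseudogene at all.
    best_pseudogene = max(pseudogene_identities, default=0.0)
    meets_identity = target_identity >= min_identity
    meets_margin = (target_identity - best_pseudogene) >= min_margin
    return meets_identity and meets_margin


def amplicon_passes(depths: Sequence[int],
                    min_depth: int = 2,
                    min_fraction: float = 0.5) -> bool:
    """Hypothetical amplicon filter for the Sanger data.

    An amplicon is used for variant calling only if strictly more than
    `min_fraction` of the samples reach `min_depth` coverage.
    """
    if not depths:
        return False
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths) > min_fraction
```

Under these assumptions, a read with 95% identity to ADAMTS7 and 94% identity to a pseudogene would be excluded, because the one-point margin is too small to assign it confidently to the true gene.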
