Abstract

Background

Sequencing of both healthy and disease singletons yields many novel and low frequency variants of uncertain significance (VUS). Complete gene and genome sequencing by next generation sequencing (NGS) significantly increases the number of VUS detected. While prior studies have emphasized protein coding variants, non-coding sequence variants have also been proven to significantly contribute to high penetrance disorders, such as hereditary breast and ovarian cancer (HBOC). We present a strategy for analyzing different functional classes of non-coding variants based on information theory (IT) and prioritizing patients with large intragenic deletions.

Methods

We captured and enriched for coding and non-coding variants in genes known to harbor mutations that increase HBOC risk. Custom oligonucleotide baits spanning the complete coding, non-coding, and intergenic regions 10 kb up- and downstream of ATM, BRCA1, BRCA2, CDH1, CHEK2, PALB2, and TP53 were synthesized for solution hybridization enrichment. Unique and divergent repetitive sequences were sequenced in 102 high-risk, anonymized patients without identified mutations in BRCA1/2. Aside from protein coding and copy number changes, IT-based sequence analysis was used to identify and prioritize pathogenic non-coding variants that occurred within sequence elements predicted to be recognized by proteins or protein complexes involved in mRNA splicing, transcription, and untranslated region (UTR) binding and structure. This approach was supplemented by in silico and laboratory analysis of UTR structure.

Results

15,311 unique variants were identified, of which 245 occurred in coding regions. With the unified IT-framework, 132 variants were identified and 87 functionally significant VUS were further prioritized. An intragenic 32.1 kb interval in BRCA2 that was likely hemizygous was detected in one patient. We also identified 4 stop-gain variants and 3 reading-frame altering exonic insertions/deletions (indels).

Conclusions

We have presented a strategy for complete gene sequence analysis followed by a unified framework for interpreting non-coding variants that may affect gene expression. This approach distills large numbers of variants detected by NGS to a limited set of variants prioritized as potential deleterious changes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12920-016-0178-5) contains supplementary material, which is available to authorized users.

Highlights

  • Sequencing of both healthy and disease singletons yields many novel and low frequency variants of uncertain significance (VUS)

  • It has not escaped our attention that the weighted probabilities obtained from this analysis could be represented and formalized using the same units of Shannon information as the other sequence changes we have described, analogous to single or multinucleotide gene variants predicted to affect nucleic acid binding sites

  • Through a comprehensive protocol based on high-throughput, information theory (IT)-based and complementary coding sequence analyses, the number of VUS can be reduced to a manageable set of variants, prioritized by predicted function
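
To make the idea of scoring variants in units of Shannon information concrete, the following is a minimal sketch of how the information content of a binding site, and the change caused by a single-nucleotide variant, can be computed in bits. It is not the authors' pipeline or software: the aligned sites in `TOY_SITES`, the pseudocount, and all function names are hypothetical, and the small-sample correction used in published IT models is omitted for brevity.

```python
import math

BASES = "ACGT"

# Hypothetical aligned binding sites (toy data, NOT from the study).
TOY_SITES = [
    "CAGGTAAGT",
    "AAGGTGAGT",
    "CAGGTAAGA",
    "TAGGTGAGT",
    "CAGGTAAGG",
    "GAGGTAAGT",
]


def weight_matrix(sites, pseudocount=0.5):
    """Build a per-position weight matrix in bits: 2 + log2(frequency).

    Assumption: a pseudocount replaces the small-sample correction, so
    unseen bases receive a finite negative weight instead of -infinity.
    """
    n = len(sites)
    matrix = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        weights = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / (n + 4 * pseudocount)
            weights[base] = 2.0 + math.log2(freq)
        matrix.append(weights)
    return matrix


def site_information(matrix, seq):
    """Sum the per-position weights of one sequence (information in bits)."""
    return sum(matrix[i][base] for i, base in enumerate(seq))


def delta_information(matrix, ref, alt):
    """Change in information (bits) when the reference site becomes alt."""
    return site_information(matrix, alt) - site_information(matrix, ref)


matrix = weight_matrix(TOY_SITES)
ref_site = "CAGGTAAGT"
alt_site = "CAGATAAGT"  # hypothetical G>A change at a conserved position
change = delta_information(matrix, ref_site, alt_site)
print(f"reference: {site_information(matrix, ref_site):.2f} bits")
print(f"variant:   {site_information(matrix, alt_site):.2f} bits")
print(f"change:    {change:.2f} bits")
```

A negative change means the variant is predicted to weaken the site under this toy model, and a positive change means it is predicted to strengthen it. Because every variant is scored in the same unit, variants affecting different kinds of binding sites can in principle be ranked on a common scale.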

Introduction

Sequencing of both healthy and disease singletons yields many novel and low frequency variants of uncertain significance (VUS). Functional analysis of large numbers of these variants often cannot be performed, due to the lack of relevant tissues and the cost, time, and labor required for each variant. Another problem is that in silico protein coding prediction tools exhibit inconsistent accuracy and are problematic for clinical risk evaluation [7,8,9]. Many HBOC patients undergoing genetic susceptibility testing will receive either an inconclusive (no BRCA variant identified) or an uncertain (BRCA VUS) result. The former has been reported in up to 80 % of cases and depends on the number of genes tested [10]. The inconsistency in diagnostic yield is significant, considering that HBOC accounts for 5–10 % of all breast/ovarian cancer [14, 15].
