Abstract

Background: As whole exome sequencing (WES) and whole genome sequencing (WGS) transition from research tools to clinical diagnostic tests, it is increasingly critical for sequencing methods and analysis pipelines to be technically accurate. The Genome in a Bottle Consortium has recently published a set of benchmark SNV, indel, and homozygous reference genotypes for the pilot whole genome NIST Reference Material based on the NA12878 genome.

Methods: We examine the relationship between human genome complexity and genes/variants reported to be associated with human disease. Specifically, we map regions of medical relevance to benchmark regions of high or low confidence. We use benchmark data to assess the sensitivity and positive predictive value of two representative sequencing pipelines for specific classes of variation.

Results: We observe that the accuracy of a variant call depends on the genomic region, variant type, and read depth, and varies by analytical pipeline. We find that most false negative WGS calls result from filtering while most false negative WES variants relate to poor coverage. We find that only 74.6 % of the exonic bases in ClinVar and OMIM genes and 82.1 % of the exonic bases in ACMG-reportable genes are found in high-confidence regions. Only 990 genes in the genome are found entirely within high-confidence regions while 593 of 3,300 ClinVar/OMIM genes have less than 50 % of their total exonic base pairs in high-confidence regions. We find greater than 77 % of the pathogenic or likely pathogenic SNVs currently in ClinVar fall within high-confidence regions. We identify sites that are prone to sequencing errors, including thousands present in publicly available variant databases. Finally, we examine the clinical impact of mandatory reporting of secondary findings, highlighting a false positive variant found in BRCA2.

Conclusions: Together, these data illustrate the importance of appropriate use and continued improvement of technical benchmarks to ensure accurate and judicious interpretation of next-generation DNA sequencing results in the clinical setting.

Electronic supplementary material: The online version of this article (doi:10.1186/s13073-016-0269-0) contains supplementary material, which is available to authorized users.
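The exonic-coverage figures in the Results (for example, the share of exonic bases that fall in high-confidence regions) come down to an interval-overlap calculation. The following is a minimal sketch of that arithmetic, not the authors' pipeline. It assumes half-open, BED-style coordinates on a single chromosome, and the function names and toy coordinates are hypothetical.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) intervals on ONE chromosome, as in BED files.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlapping_bases(exons: List[Interval], confident: List[Interval]) -> int:
    """Count exonic bases that also lie inside a high-confidence interval."""
    exons = merge_intervals(exons)
    confident = merge_intervals(confident)
    total = 0
    i = j = 0
    # Two-pointer sweep over both sorted, non-overlapping interval lists.
    while i < len(exons) and j < len(confident):
        lo = max(exons[i][0], confident[j][0])
        hi = min(exons[i][1], confident[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if exons[i][1] < confident[j][1]:
            i += 1
        else:
            j += 1
    return total


def fraction_in_high_confidence(exons: List[Interval],
                                confident: List[Interval]) -> float:
    """Fraction of (merged) exonic bases covered by high-confidence regions."""
    exon_bases = sum(end - start for start, end in merge_intervals(exons))
    if exon_bases == 0:
        return 0.0
    return overlapping_bases(exons, confident) / exon_bases


# Toy coordinates, invented for illustration only.
toy_exons = [(100, 200), (300, 400)]
toy_confident = [(150, 350)]
toy_fraction = fraction_in_high_confidence(toy_exons, toy_confident)
```

In practice this calculation would be run per chromosome over the real exon annotation and benchmark BED files, typically with a dedicated interval tool rather than hand-written code.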

Highlights

  • As whole exome sequencing (WES) and whole genome sequencing (WGS) transition from research tools to clinical diagnostic tests, it is increasingly critical for sequencing methods and analysis pipelines to be technically accurate

  • In the high-confidence regions, we assessed the accuracy of variant calls from Illumina whole genome (BWA MEM followed by Platypus) and Illumina Nextera exome sequencing (BWA followed by GATK)

  • We compared the performance for different types of potentially functional SNVs in medically relevant genes from ClinVar/OMIM as well as genome wide


Summary

Introduction

As whole exome sequencing (WES) and whole genome sequencing (WGS) transition from research tools to clinical diagnostic tests, it is increasingly critical for sequencing methods and analysis pipelines to be technically accurate. Themes throughout this work include low concordance across platforms for insertion-deletion variants, and moderate concordance between interpreters of genomic variants [1, 5, 25]. In this analysis, we characterize the GIAB [26] high-confidence regions, benchmark WGS and WES example variant calls in relation to publicly available high-confidence consensus SNV, indel, and homozygous reference genotypes for NA12878, and evaluate the clinical impact of genomic sites with systematic errors from one or more sequencing platforms. We compare performance across the whole genome to performance for different types of potentially functional variants in genes that have different levels of evidence for disease association and clinical actionability.
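Benchmarking a call set against high-confidence genotypes reduces to counting true positives, false positives, and false negatives, then deriving sensitivity and positive predictive value (PPV). The sketch below illustrates only that bookkeeping. It assumes both sets are already restricted to high-confidence regions and that variants match by exact (chromosome, position, ref, alt) identity, which ignores genotype and representation differences that real comparison tools must handle. All names and toy variants are hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# Assumption: a variant is identified by (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


class Accuracy(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    sensitivity: float  # TP / (TP + FN)
    ppv: float          # TP / (TP + FP)


def benchmark_calls(benchmark: Set[Variant], calls: Set[Variant]) -> Accuracy:
    """Compare pipeline calls with benchmark variants by exact match."""
    tp = len(benchmark & calls)   # called and in the benchmark
    fp = len(calls - benchmark)   # called but absent from the benchmark
    fn = len(benchmark - calls)   # in the benchmark but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return Accuracy(tp, fp, fn, sensitivity, ppv)


# Toy data, invented for illustration only.
toy_benchmark = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "GA"),
    ("chr2", 3000, "T", "C"),
}
toy_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "GA"),
    ("chr3", 500, "A", "T"),   # not in the benchmark: a false positive
}
toy_result = benchmark_calls(toy_benchmark, toy_calls)
```

Running the same function separately on SNVs and indels, or on subsets such as variants in ClinVar/OMIM genes, would give the per-class comparisons the text describes.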

