Abstract
Multiple Nucleotide Variants (MNVs) are miscalled by the most widely used next generation sequencing (NGS) analysis pipelines, creating the potential for missed diagnoses that would previously have been made by standard Sanger (dideoxy) sequencing. These variants, which should be treated as a single insertion-deletion mutation event, are commonly called as separate single nucleotide variants (SNVs). This can result in misannotation, incorrect amino acid predictions and potentially false positive and false negative diagnostic results. This risk will increase as confirmatory Sanger sequencing of SNVs ceases to be standard practice. Using simulated data and re-analysis of sequencing data from a diagnostic targeted gene panel, we demonstrate that the widely adopted pipeline, GATK best practices, results in miscalling of MNVs and that alternative tools can call these variants correctly. The adoption of calling methods that annotate MNVs correctly would present a solution for individual laboratories; however, GATK best practices are the basis for important public resources such as the gnomAD database. We suggest that integrating a solution into these guidelines would be the optimal approach.
Highlights
Any reports and responses or comments on the article can be found at the end of the article
By simulating Multiple Nucleotide Variants (MNVs) in Next Generation Sequencing (NGS) data and testing for them using a typical NGS pipeline employed by an NHS diagnostic laboratory, we demonstrate that MNVs are incorrectly annotated by standard diagnostic NGS pipelines, potentially generating false positive and false negative results and negatively affecting patient care.
Simulated MNVs are miscalled using GATK best practices
All five of the simulated MNVs described above were called as two separate SNVs using GATK best practices, and
Summary
Introduction
The rapid progress and reduced cost of Next Generation Sequencing (NGS) have transformed approaches to genomic research and clinical diagnostic testing [1]. While single-gene tests, for instance using Sanger (dideoxy) sequencing, will produce a short list of variants which can be manually evaluated, this is not feasible for NGS analysis. Sequencing at this scale requires highly automated analysis pipelines. High throughput sequencing services are dependent on automated tools to annotate and classify variants by potential consequence. NGS pipelines that annotate MNVs as two independent SNVs could fail to correctly identify a pathogenic variant, potentially negatively affecting clinical care.
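To make the annotation problem concrete, the following is a minimal Python sketch of how the predicted consequence can differ when two adjacent substitutions in one codon are annotated separately rather than together. The reference codon (TCT), the two substitutions and the helper names are hypothetical illustrations chosen for this example; they are not taken from the simulated variants in this study, and the sketch does not reproduce the behaviour of any particular variant caller or annotation tool.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_variants(codon, variants):
    """Return the codon after applying (position, ref, alt) substitutions.

    Positions are 0-based offsets within the codon (hypothetical convention).
    """
    bases = list(codon)
    for pos, ref, alt in variants:
        if bases[pos] != ref:
            raise ValueError(f"reference base mismatch at codon position {pos}")
        bases[pos] = alt
    return "".join(bases)


def consequence(ref_codon, variants):
    """Describe the amino acid change as 'RefAA>AltAA' ('*' denotes a stop codon)."""
    alt_codon = apply_variants(ref_codon, variants)
    return f"{CODON_TABLE[ref_codon]}>{CODON_TABLE[alt_codon]}"


# Hypothetical example: two adjacent substitutions on the same allele.
REF_CODON = "TCT"            # serine
SNV_1 = (1, "C", "A")        # alone: TCT -> TAT
SNV_2 = (2, "T", "A")        # alone: TCT -> TCA

separate = [consequence(REF_CODON, [SNV_1]), consequence(REF_CODON, [SNV_2])]
combined = consequence(REF_CODON, [SNV_1, SNV_2])

print("Annotated as two SNVs:", separate)
print("Annotated as one MNV: ", combined)
```

In this illustrative case, annotating the substitutions independently predicts one missense change and one synonymous change, whereas annotating them together as a single event predicts a premature stop codon. The sketch assumes both substitutions lie on the same allele; if they were on different alleles, the separate annotation would be the appropriate one.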