Abstract

Multiple Nucleotide Variants (MNVs) are miscalled by the most widely utilised next generation sequencing (NGS) analysis pipelines, presenting the potential for missing diagnoses that would previously have been made by standard Sanger (dideoxy) sequencing. These variants, which should be treated as a single insertion-deletion mutation event, are commonly called as separate single nucleotide variants (SNVs). This can result in misannotation, incorrect amino acid predictions and potentially false positive and false negative diagnostic results. This risk will increase as confirmatory Sanger sequencing of SNVs ceases to be standard practice. Using simulated data and re-analysis of sequencing data from a diagnostic targeted gene panel, we demonstrate that the widely adopted pipeline, GATK best practices, results in miscalling of MNVs and that alternative tools can call these variants correctly. The adoption of calling methods that annotate MNVs correctly would present a solution for individual laboratories; however, GATK best practices are the basis for important public resources such as the gnomAD database. We suggest that integrating a solution into these guidelines would be the optimal approach.

Highlights

  • Any reports and responses or comments on the article can be found at the end of the article

  • By simulating Multiple Nucleotide Variants (MNVs) in Next Generation Sequencing (NGS) data and testing for them using a typical NGS pipeline employed by an NHS diagnostic laboratory, we demonstrate that MNVs are incorrectly annotated by standard diagnostic NGS pipelines, potentially generating false positive and false negative results and negatively impacting patient care

  • Simulated MNVs are miscalled using GATK best practices: all five of the simulated MNVs described above were called as two separate SNVs, and

Introduction

The rapid progress and reduced cost of Next Generation Sequencing (NGS) have transformed approaches to genomic research and clinical diagnostic testing[1]. While single-gene tests, for instance using Sanger (dideoxy) sequencing, produce a short list of variants which can be manually evaluated, this is not feasible for next generation analysis. Sequencing at this scale requires highly automated analysis pipelines, and high throughput sequencing services depend on automated tools to annotate and classify variants by potential consequence. NGS pipelines that annotate MNVs as two independent SNVs could fail to correctly identify a pathogenic variant, potentially negatively impacting clinical care.
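
To make the annotation problem concrete, the following minimal Python sketch compares the amino acid predicted when two substitutions in the same codon are annotated independently with the amino acid predicted when they are applied together as one MNV. The codons used are illustrative assumptions chosen from the standard genetic code, not the variants simulated in this study, and the helper names (`apply_changes`, `consequences`) are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # first base T
         "LLLLPPPPHHQQRRRR"   # first base C
         "IIIMTTTTNNKKSSRR"   # first base A
         "VVVVAAAADDEEGGGG")  # first base G
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_changes(codon, changes):
    """Return the codon after applying (position, alt_base) substitutions."""
    bases = list(codon)
    for pos, alt in changes:
        bases[pos] = alt
    return "".join(bases)


def consequences(ref_codon, changes):
    """Amino acids predicted per SNV in isolation, and for the combined MNV."""
    separate = [CODON_TABLE[apply_changes(ref_codon, [c])] for c in changes]
    combined = CODON_TABLE[apply_changes(ref_codon, changes)]
    return separate, combined


# Illustrative (hypothetical) codon: TCC encodes Ser.
# Annotated separately: TAC (Tyr, missense) and TCA (Ser, synonymous).
# Annotated together:   TAA (stop), a change neither SNV call reveals.
print(consequences("TCC", [(1, "A"), (2, "A")]))  # (['Y', 'S'], '*')

# The converse: CAG encodes Gln.
# Annotated separately: TAG (stop) and CGG (Arg).
# Annotated together:   TGG (Trp), so the apparent stop is a false positive.
print(consequences("CAG", [(0, "T"), (1, "G")]))  # (['*', 'R'], 'W')
```

The first case corresponds to a potential false negative (a truncating change reported as a missense and a synonymous variant) and the second to a potential false positive (a missense change reported as a stop), which are the two failure modes described above.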
