Novel bioinformatics quality control metric for next-generation sequencing experiments in the clinical context.

Maxim Ivanov, Artem Kasianov, Sergey Musienko, Ekaterina Rozhavskaya, Ancha Baranova, Vladislav Mileyko, Mikhail Ivanov

doi:10.1093/nar/gkz775

Abstract

As the use of next-generation sequencing (NGS) for the diagnosis of Mendelian diseases expands, the performance of this method has to be improved in order to achieve higher quality. Typically, performance measures are designed in the context of each application and, therefore, account for a spectrum of clinically relevant variants. We present EphaGen, a new computational methodology for bioinformatics quality control (QC). Given a single NGS dataset in BAM format and a pre-compiled VCF file of targeted clinically relevant variants, it associates this dataset with a single arbiter parameter. Intrinsically, EphaGen estimates the probability of missing any variant from the defined spectrum within a particular NGS dataset. Such a performance measure essentially resembles the diagnostic sensitivity of the given NGS dataset. Here we present case studies of the use of EphaGen in the context of BRCA1/2 and CFTR sequencing in a series of 14 runs across 43 blood samples and 504 publicly available NGS datasets. EphaGen is superior to conventional bioinformatics metrics such as coverage depth and coverage uniformity. We recommend using this software as a QC step in NGS studies in the clinical context. Availability: https://github.com/m4merg/EphaGen or https://hub.docker.com/r/m4merg/ephagen.
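
To make the idea of a single "probability of missing a variant" parameter concrete, here is a minimal sketch in Python. It is not EphaGen's actual algorithm, which the abstract does not specify. It assumes a simple binomial read-sampling model, a heterozygous allele fraction of 0.5, and a hypothetical calling threshold of three variant-supporting reads. The function names `detection_probability` and `weighted_sensitivity` are likewise illustrative assumptions.

```python
from math import comb


def detection_probability(depth, min_alt_reads=3, allele_fraction=0.5):
    """Probability of seeing at least `min_alt_reads` variant reads.

    Assumption: variant-supporting reads follow Binomial(depth, allele_fraction).
    The threshold and allele fraction are illustrative, not EphaGen defaults.
    """
    # Sum the probabilities of observing fewer reads than the threshold.
    p_miss = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return max(0.0, 1.0 - p_miss)


def weighted_sensitivity(variants):
    """Frequency-weighted chance of detecting a variant from the spectrum.

    `variants` is a list of (population_frequency, coverage_depth) pairs,
    a hypothetical stand-in for the VCF of targeted variants plus BAM depth.
    """
    total = sum(freq for freq, _ in variants)
    if total == 0:
        raise ValueError("variant frequencies must not all be zero")
    detected = sum(freq * detection_probability(depth) for freq, depth in variants)
    return detected / total


if __name__ == "__main__":
    # A common variant at good depth and a rare variant with no coverage.
    spectrum = [(0.9, 100), (0.1, 0)]
    sensitivity = weighted_sensitivity(spectrum)
    print(f"estimated sensitivity: {sensitivity:.3f}")
    print(f"probability of missing a variant: {1 - sensitivity:.3f}")
```

Under these assumptions, a dataset with high mean coverage can still score poorly if a frequent clinically relevant variant falls in a poorly covered region, which is the kind of case that depth and uniformity metrics alone may not flag.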

Highlights

Next-generation sequencing has transformed the landscape of the whole field of medical genetics
We have developed EphaGen, an open-source application implemented in Perl/R, which can be used as a standalone version
Progress in the development of appropriate performance measures is essential to advancing applied science and engineering

Summary

Introduction

Next-generation sequencing has transformed the landscape of the whole field of medical genetics. It has enhanced the performance of genetic testing and has expanded and facilitated the understanding of clinical genetics [1–4]. Decades of research efforts and routine testing have shed light on the spectrum of variations in human genes associated with a wide range of genetic disorders, and on their clinical significance in terms of variable penetrance and expressivity [5]. For the most widespread genetic diseases, numerous research collaborations and public databases have provided information on common and population-specific minor allele frequencies for clinically significant variants. As of May 2018, the Breast Cancer Information Core database [6] contains information on relative clinically relevant variants across the BRCA1 and BRCA2 genes, implicated in hereditary breast cancer development, based on an affected population size of 11 344.



Journal: Nucleic acids research	Publication Date: Sep 12, 2019
Citations: 5	License type: CC BY 4.0
