Abstract
Although germline copy-number variants (CNVs) are the genetic cause of multiple hereditary diseases, detecting them from targeted next-generation sequencing (NGS) data remains a challenge. Existing tools perform well for large CNVs but struggle with single- and multi-exon alterations. The aim of this work is to evaluate CNV calling tools working on gene panel NGS data and their suitability as a screening step before orthogonal confirmation in genetic diagnostics strategies. Five tools (DECoN, CoNVaDING, panelcn.MOPS, ExomeDepth, and CODEX2) were tested against four genetic diagnostics datasets (two in-house and two external) for a total of 495 samples with 231 single- and multi-exon validated CNVs. The evaluation was performed using both the default and the sensitivity-optimized parameters. Results showed that most tools were highly sensitive and specific, but performance was dataset dependent. When evaluated in our diagnostics scenario, DECoN and panelcn.MOPS detected all CNVs, with the exception of one mosaic CNV missed by DECoN. However, DECoN outperformed panelcn.MOPS in specificity, achieving values greater than 0.90 when using the optimized parameters. In our in-house datasets, DECoN and panelcn.MOPS showed the highest performance for CNV screening before orthogonal confirmation. Benchmarking and optimization code is freely available at https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR.
Highlights
Introduction: Detection of large rearrangements such as copy-number variants (CNVs) from next-generation sequencing (NGS) data is still challenging due to issues intrinsic to the technology, including short read lengths and GC-content bias [1].
Supplementary information: The online version of this article contains supplementary material, which is available to authorized users.
Next-generation sequencing (NGS) is an outstanding technology to detect single-nucleotide variants and small deletion and insertion variants in genetic testing for Mendelian conditions.
To identify the copy-number variant (CNV) calling tools that could be used as a screening step in a genetic diagnostics setting, we first needed to select the candidate tools and then evaluate their performance, with special emphasis on sensitivity, both with their default parameters and with dataset-dependent optimized parameters.
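The sensitivity and specificity on which such an evaluation relies can be illustrated with a minimal sketch. It assumes that tool calls and validated results are compared per region of interest (for example, per exon) and tallied into true and false positives and negatives. The class and function names and the counts below are hypothetical; they are not taken from CNVbenchmarkeR or from the reported results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for one tool on one dataset (hypothetical layout)."""
    tp: int  # validated CNV regions that the tool called
    fp: int  # regions called by the tool but validated as normal
    tn: int  # normal regions that the tool left uncalled
    fn: int  # validated CNV regions that the tool missed


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of validated CNV regions that were detected: TP / (TP + FN)."""
    positives = c.tp + c.fn
    if positives == 0:
        raise ValueError("sensitivity is undefined without validated CNVs")
    return c.tp / positives


def specificity(c: ConfusionCounts) -> float:
    """Fraction of normal regions left uncalled: TN / (TN + FP)."""
    negatives = c.tn + c.fp
    if negatives == 0:
        raise ValueError("specificity is undefined without normal regions")
    return c.tn / negatives


# Illustrative counts only (assumption, not data from the study).
example = ConfusionCounts(tp=48, fp=10, tn=940, fn=2)
print(f"sensitivity = {sensitivity(example):.3f}")
print(f"specificity = {specificity(example):.3f}")
```

For a screening step followed by orthogonal confirmation, a missed CNV (false negative) is never re-tested, whereas a false positive only triggers an extra confirmation test, which is why sensitivity receives the emphasis.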
Summary
Detection of large rearrangements such as copy-number variants (CNVs) from NGS data is still challenging due to issues intrinsic to the technology, including short read lengths and GC-content bias [1]. The gold standards for CNV detection in genetic diagnostics are multiplex ligation-dependent probe amplification (MLPA) and array comparative genomic hybridization (aCGH) [3, 4]. Both methods are time-consuming and costly, so frequently only a subset of genes is tested and the rest are excluded from the analysis, especially when using single-gene approaches. Using NGS data as a first CNV screening step would decrease the number of MLPA/aCGH tests required and free up resources.