Abstract

Whole genome sequencing (WGS) has the potential to outperform clinical microarrays for the detection of structural variants (SVs), including copy number variants (CNVs), but has been challenged by high false positive rates. Here we present ClinSV, a WGS-based SV integration, annotation, prioritization, and visualization framework, which identified 99.8% of simulated pathogenic ClinVar CNVs > 10 kb and 11/11 pathogenic variants from matched microarrays. The false positive rate was low (1.5–4.5%) and reproducibility was high (95–99%). In clinical practice, ClinSV identified reportable variants in 22 of 485 patients (4.7%), of which 35–63% were not detectable by current clinical microarray designs. ClinSV is available at https://github.com/KCCG/ClinSV.

Highlights

  • Genomic structural variants (SVs), including copy number variants (CNVs), are an important source of genetic variation, and it is well established that large CNVs are an important cause of many inherited human genetic diseases [1,2,3]

  • To identify clinically relevant CNVs and copy-number-neutral structural variants (SVs) from Illumina short-read whole genome sequencing (WGS) data, we developed ClinSV, an SV integration, annotation, prioritization, and visualization framework, summarized in Fig. 1c and described here

  • ClinSV integrates CNV calls from CNVnator [14], which uses evidence only from depth of coverage (DOC), and Lumpy [15], which uses evidence only from discordantly mapping read pairs (DP) and split reads (SR)
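
The integration of depth-based and read-pair/split-read-based calls described in the last highlight can be illustrated with a minimal sketch. This is not ClinSV's actual implementation: the `Call` record, the `integrate` function, the 50% reciprocal-overlap threshold, and the choice to keep breakpoints from the DP/SR call are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical CNV call record (not ClinSV's internal format)."""
    chrom: str
    start: int            # 0-based, inclusive
    end: int              # exclusive
    svtype: str           # "DEL" or "DUP"
    evidence: frozenset   # e.g. {"DOC"} or {"DP", "SR"}


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Fraction of the longer call covered by the shared interval (0.0 if none)."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def integrate(depth_calls, pair_calls, min_ro=0.5):
    """Merge depth-only and DP/SR-only calls that agree; keep the rest as-is.

    min_ro is an assumed threshold, not a value taken from the paper.
    """
    merged, used = [], set()
    for d in depth_calls:
        best_idx, best_ro = None, 0.0
        for i, p in enumerate(pair_calls):
            if i in used:
                continue
            ro = reciprocal_overlap(d, p)
            if ro > best_ro:
                best_idx, best_ro = i, ro
        if best_idx is not None and best_ro >= min_ro:
            p = pair_calls[best_idx]
            used.add(best_idx)
            # Assumption: DP/SR breakpoints are more precise than DOC bin edges,
            # so the merged call keeps the DP/SR coordinates.
            merged.append(Call(p.chrom, p.start, p.end, p.svtype,
                               d.evidence | p.evidence))
        else:
            merged.append(d)
    # Calls supported only by DP/SR evidence are retained unchanged.
    merged.extend(p for i, p in enumerate(pair_calls) if i not in used)
    return merged


depth = [Call("chr1", 1000, 5000, "DEL", frozenset({"DOC"})),
         Call("chr2", 100, 900, "DUP", frozenset({"DOC"}))]
pairs = [Call("chr1", 1100, 4900, "DEL", frozenset({"DP", "SR"})),
         Call("chr3", 10, 500, "DEL", frozenset({"DP"}))]

for call in integrate(depth, pairs):
    print(call.chrom, call.start, call.end, call.svtype, sorted(call.evidence))
```

In this toy input, the two chr1 deletions overlap reciprocally by 95% and collapse into one call carrying all three evidence types, while the chr2 and chr3 calls remain single-caller calls. Recording which evidence types support each call is what lets a downstream prioritization step treat multi-evidence calls as more reliable.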


Summary

Introduction

Genomic structural variants (SVs), including copy number variants (CNVs), are an important source of genetic variation, and it is well established that large CNVs (typically > 100 kb) are an important cause of many inherited human genetic diseases [1,2,3]. Accredited array comparative genome hybridization (aCGH) or single nucleotide polymorphism (SNP) microarrays (MA) are currently the first-line clinical laboratory tests used to diagnose patients with many rare genetic diseases. Common among these conditions are intellectual disability [4] and autism [5]. Numerous recent reports have used WGS to identify short CNVs affecting single genes, or even single exons, as the cause of many monogenic disorders, suggesting considerable potential for short CNVs below the limit of detection of traditional MA to explain a proportion of previously undiagnosed patients [4, 10, 11]. The ability to find short, potentially pathogenic CNVs raises significant new interpretation challenges, as there are thousands of benign short CNVs in healthy individuals [12, 13].
