Abstract

Background: Targeted next-generation sequencing (NGS) has been widely used as a cost-effective way to identify the genetic basis of human disorders. Copy number variations (CNVs) contribute significantly to human genomic variability, and some of them can lead to disease. However, effective detection of CNVs from targeted capture sequencing data remains challenging.

Results: Here we present SeqCNV, a novel CNV calling method designed to use capture NGS data. SeqCNV extracts the read depth information and utilizes the maximum penalized likelihood estimation (MPLE) model to identify the copy number ratio and CNV boundary. We applied SeqCNV to both bacterial artificial clone (BAC) and human patient NGS data to identify CNVs. These CNVs were validated by array comparative genomic hybridization (aCGH).

Conclusions: SeqCNV is able to robustly identify CNVs of different sizes using capture NGS data. Compared with other CNV-calling methods, SeqCNV shows a significant improvement in both sensitivity and specificity.

Highlights

  • Targeted next-generation sequencing (NGS) has been widely used as a cost-effective way to identify the genetic basis of human disorders

  • Since copy number variation (CNV) boundaries are more likely to be located in introns or intergenic regions that are far from the targeted regions, many CNVs will be completely missed by paired-end mapping (PEM)-based approaches that use targeted capture sequencing data

  • We randomly generated four copy changes (two gains and two losses) at each of four sizes: 1 Mb, 100 kb, 10 kb and 1 kb. Each change contained at least one captured exon. For each of these 16 changes per sample, we mimicked a single copy number gain or loss by increasing or decreasing the number of case reads by 50% relative to the control, respectively
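
The read-scaling step described in the last highlight can be illustrated with a minimal sketch. It is not SeqCNV's own simulation code. The function name `simulate_copy_change`, the representation of reads as a list of start positions, and the choice to duplicate or drop a random half of the reads in the region are all assumptions made for illustration. The sketch also assumes the case sample starts at the same depth as the control, so that a 50% change in case reads corresponds to a copy number ratio of about 1.5 (gain) or 0.5 (loss).

```python
import random


def simulate_copy_change(case_reads, start, end, kind, rng):
    """Return read start positions with a simulated single-copy change in [start, end).

    Hypothetical helper: `case_reads` is a list of read start positions,
    `kind` is "gain" or "loss", and `rng` is a random.Random instance.
    """
    inside = [p for p in case_reads if start <= p < end]
    outside = [p for p in case_reads if not (start <= p < end)]
    n_change = len(inside) // 2  # 50% of the reads that fall in the region

    if kind == "gain":
        # Duplicate a random half of the region's reads: depth becomes ~1.5x.
        extra = rng.sample(inside, n_change)
        return sorted(outside + inside + extra)
    if kind == "loss":
        # Keep only a random half of the region's reads: depth becomes ~0.5x.
        kept = rng.sample(inside, len(inside) - n_change)
        return sorted(outside + kept)
    raise ValueError("kind must be 'gain' or 'loss'")


# Usage: one read per position, with a simulated change over a 200 bp region.
rng = random.Random(0)
reads = list(range(1000))
gained = simulate_copy_change(reads, 100, 300, "gain", rng)
lost = simulate_copy_change(reads, 100, 300, "loss", rng)
```

Reads outside the chosen region are left untouched, so only the depth ratio inside the simulated CNV differs between case and control.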

Summary

Introduction

Targeted next-generation sequencing (NGS) has been widely used as a cost-effective way to identify the genetic basis of human disorders. Copy number variations (CNVs) contribute significantly to human genomic variability, and some of them can lead to disease. Identifying and characterizing genomic variants and mutations in large numbers of individuals has become feasible, driving advances in our understanding of genetic diseases. Because whole genome sequence data are costly and complex to analyze, targeted capture sequencing has become the predominant approach for genetic diagnostic purposes. Targeted capture sequencing yields significantly greater depth of coverage, providing increased quality and fidelity at a decreased cost compared with whole genome sequencing [1,2,3,4]. CNVs have played a pivotal role in evolutionary [16,17,18] and population genetics analysis [19, 20].

