Abstract
Background
Variations in DNA copy number make an important contribution to the development of several diseases, including autism, schizophrenia and cancer. Single-cell sequencing technology allows the dissection of genomic heterogeneity at the single-cell level, thereby providing important evolutionary information about cancer cells. In contrast to traditional bulk sequencing, single-cell sequencing requires amplification of the whole genome of a single cell to accumulate enough DNA for sequencing. However, the amplification process inevitably introduces amplification bias, so that a portion of the sequencing data is over-dispersed. A recent study has shown that the over-dispersed portion of single-cell sequencing data can be modelled well by negative binomial distributions.
Results
We developed a read-depth-based method, nbCNV, to detect copy number variants (CNVs). The nbCNV method uses two constraints, sparsity and smoothness, to fit the CNV patterns under the assumption that the read signals follow a negative binomial distribution. CNV detection was formulated as a quadratic optimization problem and solved with an efficient numerical scheme based on the classical alternating direction minimization method.
Conclusions
Extensive experiments comparing nbCNV with existing benchmark models were conducted on both simulated data and empirical single-cell sequencing data. The results demonstrate that nbCNV achieves superior performance and high robustness for the detection of CNVs in single-cell sequencing data.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-016-1239-7) contains supplementary material, which is available to authorized users.
Highlights
Variations in DNA copy number make an important contribution to the development of several diseases, including autism, schizophrenia and cancer
Simulation experiments: To evaluate the performance of nbCNV, experiments on a simulation dataset from a chromosome sequence with
Parameter pruning: The dispersion parameter α is associated with the negative binomial distributions of the different copy number (CN) states
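The role of a dispersion parameter can be illustrated with a minimal sketch. It is not the nbCNV implementation: it assumes one common parameterization of the negative binomial distribution, in which a count with mean μ has variance μ + αμ², and it assumes that the mean read depth of a bin scales linearly with its copy number. The function names and the `mean_depth_per_copy` argument are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def nb_log_pmf(y, mu, alpha):
    """Log-probability of count y under a negative binomial with mean mu > 0.

    Assumed parameterization: variance = mu + alpha * mu**2, so alpha -> 0
    approaches a Poisson distribution and larger alpha means more over-dispersion.
    """
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def simulate_read_counts(copy_number, n_bins, mean_depth_per_copy, alpha, rng):
    """Draw over-dispersed read counts for bins sharing one copy number state.

    Assumption for illustration: the expected depth is
    copy_number * mean_depth_per_copy, with copy_number >= 1.
    """
    mu = copy_number * mean_depth_per_copy
    r = 1.0 / alpha
    p = r / (r + mu)  # NumPy's success probability giving mean mu
    return rng.negative_binomial(r, p, size=n_bins)


rng = np.random.default_rng(0)
counts = simulate_read_counts(copy_number=2, n_bins=100_000,
                              mean_depth_per_copy=50.0, alpha=0.1, rng=rng)
# Under a Poisson model the variance would be close to the mean;
# here it is expected to be much larger.
print("mean:", counts.mean(), "variance:", counts.var())
```

In this sketch a single α is shared by all bins of one copy number state. Fitting a separate α per CN state, as the highlight above suggests, would amount to calling `nb_log_pmf` with a state-specific value.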
Summary
Variations in DNA copy number make an important contribution to the development of several diseases, including autism, schizophrenia and cancer. Existing whole-genome amplification (WGA) techniques, such as degenerate oligonucleotide primed-polymerase chain reaction (DOP-PCR) [30], multiple displacement amplification (MDA) [17] and multiple annealing and looping-based amplification cycles (MALBAC) [36], inevitably introduce amplification bias to varying degrees when the whole genome of a single cell is amplified to microgram levels for next-generation sequencing (NGS) [13, 28].