Abstract
Background: Cancer progression is associated with genomic instability and an accumulation of gains and losses of DNA. The growing variety of tools for measuring genomic copy numbers, including various types of array-CGH, SNP arrays and high-throughput sequencing, calls for a coherent framework offering unified and consistent handling of single- and multi-track segmentation problems. In addition, there is a demand for highly computationally efficient segmentation algorithms, due to the emergence of very high density scans of copy number.

Results: A comprehensive Bioconductor package for copy number analysis is presented. The package offers a unified framework for single sample, multi-sample and multi-track segmentation and is based on statistically sound penalized least squares principles. Conditional on the number of breakpoints, the estimates are optimal in the least squares sense. A novel and computationally highly efficient algorithm is proposed that utilizes vector-based operations in R. Three case studies are presented.

Conclusions: The R package copynumber is a software suite for segmentation of single- and multi-track copy number data using algorithms based on coherent least squares principles.
Highlights
Cancer progression is associated with genomic instability and an accumulation of gains and losses of DNA
Genome-wide scans of copy number alterations may be obtained with array-based comparative genomic hybridization (array-CGH), single-nucleotide polymorphism (SNP) arrays and high-throughput sequencing (HTS)
Selection of penalty: the selection of parameters determining the trade-off between high sensitivity and high specificity is important in all segmentation procedures
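
To make the penalized least squares idea and the role of the penalty concrete, here is a minimal sketch in Python. It is not the algorithm implemented in the copynumber package: it is a textbook quadratic-time dynamic program for a single track, written under the assumption that the criterion is the residual sum of squares plus a fixed penalty `gamma` per breakpoint. The function name `segment_pls` and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def segment_pls(y, gamma):
    """Piecewise-constant fit minimising SSE + gamma * (number of breakpoints).

    Illustrative O(n^2) dynamic program; assumes a fixed penalty per breakpoint.
    Returns (breakpoints, segment_means), where a breakpoint b means that a new
    segment starts at index b.
    """
    n = len(y)
    # Prefix sums of y and y^2 give the SSE of any segment in constant time.
    s1 = [0.0] + list(accumulate(y))
    s2 = [0.0] + list(accumulate(v * v for v in y))

    def sse(i, j):
        # Sum of squared deviations from the mean over y[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] is the minimal penalised cost of fitting y[:j].
    # Starting at -gamma means the first segment carries no penalty.
    best = [0.0] * (n + 1)
    best[0] = -gamma
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        best[j], prev[j] = min(
            (best[i] + sse(i, j) + gamma, i) for i in range(j)
        )

    # Walk back through the stored segment starts.
    starts = []
    j = n
    while j > 0:
        starts.append(prev[j])
        j = prev[j]
    starts.reverse()

    bounds = starts + [n]
    means = [(s1[b] - s1[a]) / (b - a) for a, b in zip(bounds, bounds[1:])]
    return starts[1:], means


if __name__ == "__main__":
    # Toy log-ratio track: baseline, a gained region, baseline again.
    track = [0.1, -0.1, 0.0, 0.1, -0.1,
             1.1, 0.9, 1.0, 1.1, 0.9,
             0.0, 0.1, -0.1, 0.0]
    for gamma in (0.5, 10.0):
        print(gamma, segment_pls(track, gamma))
```

A small `gamma` accepts more breakpoints (higher sensitivity, more false positives), while a large `gamma` suppresses them (higher specificity, more missed aberrations). For any fixed number of breakpoints the returned segment means are the least squares estimates, which mirrors the optimality property stated in the abstract.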