Abstract

Comparing read depths from next-generation sequencing between cancer and normal cells makes it possible to estimate copy number alteration (CNA), even at very low coverage. However, estimating CNA from patients' tumour samples poses considerable challenges because the samples are infiltrated with normal cells and cancer genomes are aneuploid. Here we provide a method that corrects for contamination with normal cells and adjusts for genomes of different sizes so that the actual copy number of each region can be estimated. The procedure consists of several steps. First, we identify the multi-modality of the distribution of smoothed ratios. Then we use the estimates of the means (modes) to identify the underlying ploidy and the contamination level. Finally, we perform the correction. The results indicate that the method correctly estimates genomic regions with gains and losses in a range of simulated data as well as in two datasets from lung cancer patients. It also proves a powerful tool when analysing publicly available data from two cell lines (HCC1143 and COLO829). An R package, called CNAnorm, is available at http://www.precancer.leeds.ac.uk/cnanorm or from Bioconductor.

Contact: a.gusnanto@leeds.ac.uk

Supplementary data are available at Bioinformatics online.
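
The correction step described in the abstract can be illustrated with a small Python sketch. This is not the CNAnorm implementation. It assumes a simple linear mixing model in which a region with tumour copy number c, in a sample with tumour fraction p and diploid normal cells, gives a ratio proportional to p*c + 2*(1 - p). It also assumes the detected modes correspond to consecutive integer copy numbers and that the copy number of the lowest mode is known. All function and parameter names are hypothetical.

```python
def expected_ratio(copy_number, tumour_fraction, scale):
    """Ratio predicted by the assumed linear mixing model.

    scale absorbs the genome-size normalisation; normal cells are
    assumed to carry two copies of every region.
    """
    return scale * (tumour_fraction * copy_number + 2.0 * (1.0 - tumour_fraction))


def estimate_from_modes(modes, lowest_copy_number):
    """Estimate mode spacing, offset and tumour fraction from ratio modes.

    modes: ratio values at the peaks of the smoothed ratio distribution,
           assumed to belong to consecutive integer copy numbers.
    lowest_copy_number: copy number assigned to the smallest mode
           (an assumption the user must supply).
    """
    modes = sorted(modes)
    if len(modes) < 2:
        raise ValueError("at least two modes are needed")
    # Average distance between adjacent modes equals scale * tumour_fraction.
    spacing = (modes[-1] - modes[0]) / (len(modes) - 1)
    # Ratio that a region with zero tumour copies would show:
    # scale * 2 * (1 - tumour_fraction).
    offset = modes[0] - spacing * lowest_copy_number
    scale = spacing + offset / 2.0
    tumour_fraction = spacing / scale
    return spacing, offset, tumour_fraction


def corrected_copy_number(ratio, spacing, offset):
    """Map an observed ratio back to a tumour copy number."""
    return (ratio - offset) / spacing


if __name__ == "__main__":
    # Simulated modes for copy numbers 1 to 4 with 40% normal contamination.
    simulated = [expected_ratio(c, 0.6, 0.4) for c in (1, 2, 3, 4)]
    spacing, offset, fraction = estimate_from_modes(simulated, lowest_copy_number=1)
    print(round(fraction, 3), round(corrected_copy_number(simulated[2], spacing, offset), 3))
```

The choice of copy number for the lowest mode is exactly the kind of ambiguity that can lead to several valid solutions, so in this sketch it is left as an explicit input rather than inferred.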

Highlights

  • We showed how the method performs with a range of simulated data, on low coverage data from patients’ samples and on higher coverage data from cell lines

  • We acknowledge that the problem could lead to several valid solutions, but provide an easy way for the user to correct the estimation from CNAnorm when independent clues are available

Introduction

Cancer cells often exhibit severe karyotypic alterations: whole chromosome gain or loss and structural rearrangements such as amplifications, deletions and translocations result in widespread aneuploidy (Hartwell and Kastan, 1994). The ability to detect copy number alterations (CNAs) of cancer cells is a crucial step to assess the severity of chromosomal rearrangements and to find chromosomal regions where breakpoints are located. Several methodologies are available to detect CNAs: comparative genomic hybridization (CGH) (Kallioniemi et al., 1992), array CGH (aCGH) (Pinkel et al., 1998) and single nucleotide polymorphism (SNP) arrays (Bignell et al., 2004). More recently, a new generation of sequencing machines (Roche 454, Illumina GAII, HiSeq, MiSeq, ABI SOLiD, Ion Torrent PGM) has enabled massively parallel sequencing, making it possible to sequence full genomes at affordable cost. We previously showed (Wood et al., 2010) how it is possible to multiplex several samples in one Illumina GAII lane, making copy number analysis by sequencing affordable and competitive with aCGH or SNP arrays. As we expect sequencing technologies to become more widespread, affordable and accurate, copy number analysis by low coverage sequencing will become even more convenient and informative. Sequencing is possible even with low amounts of DNA extracted from formalin-fixed paraffin-embedded specimens (Wood et al., 2010).
