Abstract

In an effort to improve somatic mutation calling pipelines, the Human Genome Sequencing Center at Baylor College of Medicine, in collaboration with The Cancer Genome Atlas, has undertaken comprehensive sequencing and a comparative analysis of mutation calling in paired tumor/normal samples. Somatic mutation calling of single nucleotide variants (SNVs) and structural variants (SVs) was evaluated using two tumor/normal paired breast cancer cell lines, HCC1954 and HCC1143. Through controlled in silico mixing of the tumor/normal and alternating cell line sequencing data, the effects of tumor purity (i.e., normal contamination) and subclonal expansion on the sensitivity and specificity of somatic mutation calling have been comprehensively evaluated. In addition, cross-platform comparisons were performed to evaluate the costs and benefits of including long-range information for structural variation analysis (for example, PacBio long-read or large-fragment mate-pair sequencing libraries) and RNAseq gene expression information for associating functional consequences with somatic mutations. A full spectrum of sequencing data has been generated and subjected to SNV and SV somatic mutation calling and integrative analysis, including whole genome sequencing (WGS) from Illumina short paired-end (250X coverage), Illumina long mate-pair (10X coverage), and PacBio long-read (5X coverage) libraries. For the short paired-end WGS data, various fractions of tumor-derived and normal-derived cell line DNA sequence were combined to simulate tumor purity. Similarly, varying fractions of known spike-in SNVs and SVs were added to simulate subclonal expansions on a heterogeneous tumor genome background.
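
The in silico mixing of tumor-derived and normal-derived reads can be illustrated with a short read-fraction calculation. This is a minimal sketch, not the authors' pipeline: the function name `plan_mix`, the `MixPlan` type, and the coverage figures in the usage note are hypothetical. It assumes that purity is defined as the fraction of reads in the mixture drawn from the tumor data, that the tumor cell line data are themselves free of normal contamination, and that tumor ploidy is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MixPlan:
    """Fractions of each input read set to keep (hypothetical helper type)."""
    tumor_fraction: float   # share of tumor reads to sample
    normal_fraction: float  # share of normal reads to sample


def plan_mix(tumor_cov: float, normal_cov: float,
             purity: float, target_cov: float) -> MixPlan:
    """Compute subsampling fractions for a simulated tumor of a given purity.

    Assumption: purity is the fraction of reads in the mixture that come
    from the tumor-derived data; ploidy differences are ignored.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if min(tumor_cov, normal_cov, target_cov) <= 0:
        raise ValueError("coverages must be positive")

    # Coverage each source must contribute to the mixture.
    tumor_needed = purity * target_cov
    normal_needed = (1.0 - purity) * target_cov

    # Fraction of each input read set that supplies that coverage.
    tumor_fraction = tumor_needed / tumor_cov
    normal_fraction = normal_needed / normal_cov

    if tumor_fraction > 1.0 or normal_fraction > 1.0:
        raise ValueError("not enough input coverage for the requested mixture")
    return MixPlan(tumor_fraction, normal_fraction)
```

For example, `plan_mix(tumor_cov=100, normal_cov=50, purity=0.8, target_cov=50)` asks for a 50X mixture in which 80% of reads come from the tumor data. The two returned fractions could then be passed to any read subsampling tool before the sampled read sets are merged.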
This benchmarking study provides novel insights into the benefits of incorporating long-range sequencing data, such as Illumina 3Kbp mate-pair and PacBio long-read (9Kbp N50 read length) libraries, which allow orthogonal sequencing-library and cross-platform confirmation as well as independent identification of somatic SV mutations. Furthermore, incorporating tumor/normal RNAseq data allows integration of gene expression and fusion transcripts, thereby associating functional consequences with specific somatic mutations. Lastly, copy number alterations derived from paired-end WGS read depth and from single nucleotide polymorphism (SNP) array data were comparatively evaluated. This comprehensive comparative evaluation and integrative analysis of somatic SNV and SV calling across a full spectrum of sequencing data provides feedback for improving somatic mutation calling pipelines and for designing future whole genome sequencing cancer studies.

Citation Format: Oliver A. Hampton, Richard A. Gibbs, David A. Wheeler. Comparative evaluation of somatic mutations calls on single nucleotide variants and structural variants using breast cancer cell lines. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr 2370. doi:10.1158/1538-7445.AM2014-2370
