Abstract

We have developed a new method that uses high-throughput reads spanning multiple somatic point mutations to reconstruct multiple, genetically diverse subclonal populations from one or more heterogeneous tumor samples. Tumors often contain multiple, genetically diverse subclonal populations, as predicted by the clonal theory of cancer. These subclonal populations develop through successive waves of expansion and selection and differ in their ability to metastasize and resist treatment. Identifying these subpopulations and their evolutionary relationships can help identify driver mutations associated with cancer development and progression. Subclonal reconstruction algorithms attempt to infer the prevalence and genotype of multiple, genetically related subclonal populations using the variant allele frequency (VAF) of somatic variants. To date, these algorithms exclusively use data on individual somatic mutations. This restriction greatly reduces their ability to fully resolve phylogenetic ambiguities. In some cases, it is possible to determine the mutation status of more than one mutation in a single cell, for example, when single reads cover multiple single nucleotide variants (SNVs). This type of information can add considerable power to the phylogenetic reconstruction of the tumor subclonal populations. We have developed the PhyloSpan algorithm, which attempts to infer the states of multiple SNVs in single cells and then exploits that information in subclonal reconstruction. Our algorithm starts by phasing somatic SNVs, looking for reads or read pairs that cover both a somatic mutation and a germline heterozygous single nucleotide polymorphism (SNP). These germline SNPs are often available through profiling of normal tissue. PhyloSpan then identifies SNVs that are on the same chromosome and close enough to be covered by a single read or by paired reads.
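As a rough illustration of the pairing step just described, the sketch below lists SNV pairs that lie on the same chromosome and within a maximum genomic distance of each other. It is not the PhyloSpan implementation: the `SNV` type, the function `find_spannable_pairs`, and the `max_span` parameter are hypothetical names introduced here, and `max_span` is assumed to stand in for whatever distance a single read or read pair can cover in a given data set.

```python
from collections import defaultdict
from typing import NamedTuple


class SNV(NamedTuple):
    """Hypothetical minimal SNV record: chromosome name and coordinate."""
    chrom: str
    pos: int


def find_spannable_pairs(snvs, max_span):
    """Return pairs of SNVs on the same chromosome at most max_span bases apart.

    max_span is an assumed stand-in for the distance one read or read pair
    can cover; the abstract does not specify how this is chosen.
    """
    # Group SNVs by chromosome so that only same-chromosome pairs are compared.
    by_chrom = defaultdict(list)
    for snv in snvs:
        by_chrom[snv.chrom].append(snv)

    pairs = []
    for chrom_snvs in by_chrom.values():
        # Sorting by position lets the inner loop stop at the first SNV
        # that is too far away, since all later ones are farther still.
        chrom_snvs.sort(key=lambda s: s.pos)
        for i, left in enumerate(chrom_snvs):
            for right in chrom_snvs[i + 1:]:
                if right.pos - left.pos > max_span:
                    break
                pairs.append((left, right))
    return pairs


example_snvs = [
    SNV("chr1", 100),
    SNV("chr1", 180),
    SNV("chr1", 5000),
    SNV("chr2", 150),
]
example_pairs = find_spannable_pairs(example_snvs, max_span=150)
```

With these example coordinates, only the two SNVs at positions 100 and 180 on `chr1` are close enough to be paired. Reads overlapping both positions of a pair would then be the ones inspected for the presence or absence of each mutation.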
These pairs of mutations provide more phylogenetic certainty than can be found by looking at mutations independently. For example, if the two SNVs are found in the same evolutionary branch, then we expect to see some reads containing both mutations. If, however, the SNVs are on separate branches, then no reads should show both SNVs. PhyloSpan integrates this phylogenetic information with the VAF of each somatic SNV to perform subclonal reconstruction. Incorporating these various types of information, especially given the substantial uncertainty in phasing and NGS read content, requires a rigorous statistical approach, so we have developed a Bayesian non-parametric tree-based clustering algorithm based on our existing PhyloWGS method. This algorithm not only infers the number of subclonal populations and their genotypes but also provides a measure of uncertainty about this inference, enabling users to determine which parts of the subclonal reconstruction are certain and which remain ambiguous. Although few SNVs lie within a short-read length of another SNV, a handful of such pairs is all that is needed to eliminate a substantial amount of ambiguity in subclonal reconstruction. Furthermore, long-read (>10 kb) technologies, such as PacBio, can be used to supplement short-read sequencing. Our approach generalizes to permit the integration of single-cell sequencing with bulk tumor sequencing. We can also use our framework to identify a small number of SNVs for which low-throughput assays would be most useful in resolving ambiguity in the subclonal reconstruction. We will present results from applying our algorithm to whole genome sequencing data, showing the added value of considering multiple SNVs jointly compared with considering SNVs independently.

Citation Format: Amit G. Deshwar, Levi Boyles, Jeff Wintersinger, Paul C. Boutros, Yee Whye Teh, Quaid Morris. PhyloSpan: Using multi-mutation reads to resolve subclonal architectures from heterogeneous tumor samples. [abstract]. In: Proceedings of the AACR Special Conference on Computational and Systems Biology of Cancer; Feb 8-11 2015; San Francisco, CA. Philadelphia (PA): AACR; Cancer Res 2015;75(22 Suppl 2):Abstract nr B2-59.

