Abstract

Tumor growth is an evolutionary process involving accumulation of mutations, copy number alterations, and cancer stem cell (CSC) division and differentiation. As direct observation of this process is impossible, inference regarding when mutations occur and how stem cells divide is difficult. However, this ancestral information is encoded within the tumor itself, in the form of intratumoral heterogeneity of the tumor cell genomes. Here we present a framework that allows simulation of these processes and estimation of mutation rates at the various stages of tumor development and CSC division patterns for single-gland sequencing data from colorectal tumors. We parameterize the mutation rate and the CSC division pattern, and successfully retrieve their posterior distributions based on DNA sequence level data. Our approach exploits Approximate Bayesian Computation (ABC), a method that is becoming widely used for problems of ancestral inference.

Highlights

  • In this paper we propose a simulation-based method that can be used to estimate both the mutation rate and the asymmetric division rate of cancer stem cells (CSCs), and that can additionally infer whether a mutation burst occurred in the tumor

  • Since mutation rates are relatively low, we model the number of DNA mutations, n, introduced into each daughter cell according to a Poisson distribution, the mean of which is referred to as the mutation rate

  • For each simulated dataset we modeled tumor growth and sampled 6 glands from each half of the resulting tumor
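The Poisson mutation model in the highlights can be illustrated with a minimal sketch. It assumes that every new mutation is unique (each gets a fresh integer ID) and that daughters inherit all parental mutations; the function names, the ID scheme, and the rate used in the usage note are hypothetical and are not taken from the paper.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson(rate) variate using Knuth's multiplication method.

    Adequate for the small rates assumed here; not intended for large rates.
    """
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def divide(parent_mutations, mutation_rate, rng, next_id):
    """Split one cell into two daughters.

    Each daughter inherits the parent's mutations and gains
    n ~ Poisson(mutation_rate) new ones. Mutations are labeled with
    fresh integer IDs (an assumption made for this illustration).
    Returns the two daughter mutation sets and the next unused ID.
    """
    daughters = []
    for _ in range(2):
        n = poisson_sample(mutation_rate, rng)
        new_mutations = frozenset(range(next_id, next_id + n))
        next_id += n
        daughters.append(frozenset(parent_mutations) | new_mutations)
    return daughters, next_id
```

For example, `divide(frozenset(), 0.1, random.Random(0), 0)` returns two daughter mutation sets together with the next free mutation ID, which can be passed to the following division.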


Summary

Methods

The ABC version of rejection sampling is as follows. For i = 1 to N:

  • Sample parameters θ' from the prior distribution π(θ)

  • Simulate data D' using the tumor growth model described earlier with the sampled parameters θ', and summarize D' as S'

A number of methods have been developed to choose a concise set of summary statistics that remains informative for inferring the posterior distributions of model parameters [25,26,27,28,29]. Some summary statistics are more informative for a particular parameter than others, so placing a higher weight on such a statistic helps infer the posterior distribution of that parameter [25,30]. Including the weight of each summary statistic, the distance metric is defined as:

$$d(S', S) = \lVert (S' - S)\, W^{T} \rVert \qquad (2)$$
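The sampling loop and the weighted distance can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: W is taken to be diagonal and passed as a weight vector, the norm is taken to be Euclidean, and the acceptance rule (keep θ' when the distance is at most a tolerance `epsilon`) is the standard ABC rejection step, which the excerpt above does not spell out. The toy prior and simulator stand in for the tumor growth model, and all names are hypothetical.

```python
import math
import random


def weighted_distance(s_sim, s_obs, weights):
    """Euclidean norm of (S' - S) W^T, assuming a diagonal W given as a vector."""
    return math.sqrt(
        sum((w * (a - b)) ** 2 for a, b, w in zip(s_sim, s_obs, weights))
    )


def abc_rejection(s_obs, prior_sampler, simulate_summary, weights,
                  epsilon, n_draws, rng):
    """Return the prior draws whose simulated summaries fall within epsilon of s_obs."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)              # theta' ~ pi(theta)
        s_sim = simulate_summary(theta, rng)    # simulate D', summarize as S'
        if weighted_distance(s_sim, s_obs, weights) <= epsilon:
            accepted.append(theta)              # assumed standard acceptance rule
    return accepted


def toy_prior(rng):
    """Hypothetical uniform prior on [0, 10]."""
    return rng.uniform(0.0, 10.0)


def toy_simulate_summary(theta, rng):
    """Hypothetical simulator: 20 Gaussian draws summarized by mean and range."""
    draws = [rng.gauss(theta, 1.0) for _ in range(20)]
    return (sum(draws) / len(draws), max(draws) - min(draws))
```

In this toy setup the first summary (the mean) carries most of the information about θ, so it is given a larger weight than the second, for instance `weights=(1.0, 0.1)`. The accepted draws then approximate the posterior distribution of θ given the observed summaries.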

