Abstract

The population dynamics of preclonal cancer cells before the clonal expansion of tumors have not been sufficiently addressed. Treating the preclonal cancer cell population as a Darwinian evolutionary system, we formulated and analyzed the observed mutation frequency among tumors (MFaT) as a proxy for the hypothesized sequence read frequency and beneficial fitness effect of a cancer driver mutation. Analogous to intestinal crypts, we assumed that sample donor patients are separate culture tanks in which proliferating cells follow population dynamics described by extreme value theory (EVT). To validate this, we analyzed three large-scale cancer genome datasets, each harboring > 10000 tumor samples and together involving > 177898 observed mutation sites. We clarified the premises necessary for applying EVT in the strong selection and weak mutation (SSWM) regime to cancer genome sequences at scale. We also confirmed that the stochastic distribution of MFaT is likely of the Fréchet type, which challenges the well-known Gumbel hypothesis of beneficial fitness effects. Based on statistical data analysis, we demonstrated the potential of EVT as a population genetics framework for understanding and explaining the stochastic behavior of driver-mutation frequency in cancer genomes, as well as its applicability to real cancer genome sequence data.

Highlights

  • The “Big Bang” model of cancer development and population genetics of cancer cells. To deconvolve the complex biology of cancer, it is useful to trace the temporal order of population dynamics events of cancer cells as well as the underlying somatic genetic events [1].

  • A visual inspection of the distribution shape using a density plot (Fig 2A and S2A Fig) suggested that the probability distribution of the cancer driver mutation frequency among tumors (MFaT) approximately follows an extreme value distribution

  • A Q-Q plot with Sugano normalization confirmed that the driver MFaT distribution is approximately described by the extreme value distribution given the relationship between observed and theoretical values (Fig 2B, S2B Fig)
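The Q-Q comparison described in the highlights can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic draws from a Fréchet distribution standing in for MFaT values, and the function names (`gumbel_quantile`, `qq_correlation`, `sample_frechet`), the shape parameter `alpha = 2.0`, and the plotting positions are all assumptions made for illustration. It does not implement the normalization step mentioned above. The sketch relies on a standard fact: if X is Fréchet-distributed, then ln X is Gumbel-distributed, so a Q-Q plot against Gumbel quantiles should be closer to a straight line for ln X than for X itself.

```python
import math
import random


def gumbel_quantile(p):
    """Quantile function of the standard Gumbel distribution."""
    return -math.log(-math.log(p))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def qq_correlation(values):
    """Correlation between sorted values and standard Gumbel quantiles.

    Uses plotting positions (i + 0.5) / n; a value near 1 means the
    Q-Q plot against the Gumbel distribution is close to a straight line.
    """
    ordered = sorted(values)
    n = len(ordered)
    theoretical = [gumbel_quantile((i + 0.5) / n) for i in range(n)]
    return pearson(ordered, theoretical)


def sample_frechet(n, alpha, rng):
    """Draw n values from a unit-scale Frechet(alpha) by inverse transform.

    F(x) = exp(-x**(-alpha))  =>  x = (-ln u)**(-1/alpha)
    """
    draws = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        draws.append((-math.log(u)) ** (-1.0 / alpha))
    return draws


# Synthetic stand-in for MFaT values (assumption: not real cancer genome data).
rng = random.Random(0)
synthetic_mfat = sample_frechet(2000, alpha=2.0, rng=rng)

r_raw = qq_correlation(synthetic_mfat)
r_log = qq_correlation([math.log(x) for x in synthetic_mfat])

print(f"Q-Q correlation vs Gumbel, raw values: {r_raw:.4f}")
print(f"Q-Q correlation vs Gumbel, log values: {r_log:.4f}")
```

If the log-transformed values track the Gumbel quantiles more closely than the raw values do, that is consistent with a heavy-tailed, Fréchet-type distribution rather than a Gumbel-type one. On real data, a formal fit of the generalized extreme value distribution and inspection of its shape parameter would be the more rigorous check.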


Summary

Introduction

The “Big Bang” model of cancer development and population genetics of cancer cells

To deconvolve the complex biology of cancer, it is useful to trace the temporal order of population dynamics events of cancer cells as well as the underlying somatic genetic events [1]. A long mutation and selection process precedes a rapid population increase that results in a clonal expansion, which is observed as the formation of a tumor.

Extreme value theory as a framework for understanding mutation frequency distribution in cancer genome

COSMIC: https://cancer.sanger.ac.uk/cosmic
CHANG: https://github.com/taylor-lab/hotspots/blob/master/LINK_TO_MUTATIONAL_DATA
RTCGA: https://rtcga.github.io/RTCGA/
IntOGen: https://www.intogen.org/search
DoCM: http://docm.info
