An important aspect of cancer progression is the way in which gene mutations accumulate in cellular lineages. Comprehensive efforts to catalog cancer genes have revealed that tumors vary in which genes accumulate mutations, and that this variation depends on the presence or absence of other mutations. However, understanding the stochastic process by which mutations arise across the genome remains an important open problem in biology, because modeling discrete-variate time series is the most challenging and, as yet, least well-developed area of time-series research. In this paper, a DEGBOE framework is proposed to model the mutation time series given sequence data of the gene mutations. The method relates the discrete-time, nonlinear, and nonstationary series of gene mutations to a time-varying autoregressive moving average (ARMA) model. It represents each observation as a nonlinear function of two variables: the gene mutation itself, and gene-gene interactions, which characterize how the varying presence or absence of other gene mutations affects a mutation's occurrence and evolution. DEGBOE is applied to model the dynamics of frequently mutated genes in lung cancer, including EGFR, KRAS, and TP53. The results of the model are analyzed and compared with the original simulated DNA walk data and with experimental lung cancer mutation data. The model identifies the driver role of TP53 mutations in lung cancer progression.