Two novel models and a parthenogenetic algorithm for detecting common driver pathways from pan-cancer data

Jingli Wu, Ke Pan, Gaoshi Li, Kai Zhu, Qirong Cai

doi:10.1016/j.engappai.2020.104010

Abstract

With the rapid development of high-throughput sequencing technologies, the huge volumes of cancer genomics data now being generated make it possible to understand carcinogenesis at the molecular level. Studying commonalities among different cancers is considered one of the significant problems in understanding cancer, and is expected to benefit personalized therapy and precision medicine in cancer treatment. The ComMDP method is a useful approach to this problem. However, when the numbers of samples differ substantially among cancers, the ComMDP strategy of accumulating the absolute weight value of each cancer may cause some driver pathways to be missed. In this paper, two mathematical models, CDP-V and CDP-H, are presented; they replace the absolute weight values with relative ones by using variance and harmonic mean, respectively. A parthenogenetic algorithm is proposed for solving these two models, built on a short chromosome code and a greedy-based recombination operator. Extensive experiments were performed on both simulated and real cancer data. The experimental results show that, given several types of cancer, the gene sets identified with the presented models and algorithm not only are mutated in a large proportion of the samples of these cancers, but also have similar proportions of mutated samples in each cancer. In addition, some biologically meaningful gene sets that are missed by the ComMDP method are detected. Hence the identification methods based on the presented models and algorithm may become useful complementary tools for identifying cancer pathways.
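
The motivation for relative weights can be illustrated with a small numerical sketch. It is not the CDP-V or CDP-H formulation, which the abstract does not spell out: the "weight" here is simplified to the number of samples in which a gene set is mutated, and the cohort sizes, counts, and function names are hypothetical. The sketch only shows that summing absolute counts can favour a gene set dominated by one large cohort, whereas the harmonic mean and the variance of per-cancer proportions favour a gene set that is mutated at a similar rate in every cancer.

```python
from statistics import harmonic_mean, pvariance

# Hypothetical data: (samples with a mutation in the gene set, total samples)
# for three cancer types with very different cohort sizes.
gene_set_a = [(450, 500), (5, 50), (4, 40)]    # high only in the large cohort
gene_set_b = [(300, 500), (30, 50), (24, 40)]  # 60% in every cohort


def absolute_score(counts):
    """Sum of absolute mutated-sample counts across cancers (simplified weight)."""
    return sum(mutated for mutated, _ in counts)


def relative_fractions(counts):
    """Proportion of mutated samples within each cancer."""
    return [mutated / total for mutated, total in counts]


def summarize(counts):
    """Absolute score plus two balance-aware summaries of the proportions."""
    fractions = relative_fractions(counts)
    return {
        "absolute": absolute_score(counts),
        "harmonic_mean": harmonic_mean(fractions),  # low if any cancer is low
        "variance": pvariance(fractions),           # low if cancers agree
    }


summary_a = summarize(gene_set_a)
summary_b = summarize(gene_set_b)

if __name__ == "__main__":
    print("gene set A:", summary_a)
    print("gene set B:", summary_b)
```

With these assumed counts, gene set A has the larger absolute score because of the 500-sample cohort, while gene set B has the higher harmonic mean and the lower variance of its per-cancer proportions, which is the kind of balanced gene set the abstract says an absolute-weight criterion may miss.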

Full Text