Abstract

Identifying cooperative driver modules from large-scale cancer omics data is one of the key topics in bioinformatics. Current methods usually focus on discovering driver modules from batch omics data and ignore cancer heterogeneity at the cell level, which leaves them vulnerable to batch-level noise. To overcome these limitations, we propose a cooperative driver module identification method (CDMFinder) based on single-cell data and prior knowledge. CDMFinder first uses the gene co-expression specificity of different cancer subtypes, together with normal cell expression data, to construct expression association networks, and fuses these networks with a gene interaction network to obtain a gene functional association network. This fusion reduces network complexity while capturing in-depth functional associations between genes. CDMFinder then applies overlapping Markov clustering to the functional network to mine functional clusters, and uses a greedy strategy with a hybrid weight function to identify driver modules from those clusters. Finally, it introduces a distance function on driver module sets, based on interaction and mutation co-occurrence, to identify cooperative driver modules. CDMFinder integrates a variety of genetic factors (i.e., expression, mutation, and subtype specificity) and achieves strong performance. Experimental results on the multi-omics data of breast cancer and glioblastoma show that the number of driver genes identified by CDMFinder is 1.35 times larger than that of competing methods. The identified cooperative driver modules are enriched at the pathway/functional level and are more than 1.5 times as numerous as those identified by the compared methods.
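
The network construction step described above can be illustrated with a minimal sketch. The abstract does not specify how co-expression is measured or how the networks are fused, so everything here is an assumption made for illustration: the absolute Pearson correlation cutoff, the rule of keeping subtype edges that are absent in normal cells, the intersection with the interaction network, and the hypothetical function names `coexpression_edges` and `fuse_networks`. It is not CDMFinder's actual procedure.

```python
import numpy as np


def coexpression_edges(expr, genes, threshold=0.8):
    """Return gene pairs whose |Pearson correlation| is at least `threshold`.

    expr: array of shape (n_genes, n_cells); the threshold is an assumed value.
    """
    corr = np.corrcoef(expr)
    edges = set()
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if abs(corr[i, j]) >= threshold:
                edges.add(frozenset((genes[i], genes[j])))
    return edges


def fuse_networks(subtype_edges, normal_edges, interaction_edges):
    """Assumed fusion rule: keep co-expression edges specific to the subtype
    (absent in normal cells) that the prior interaction network also supports."""
    subtype_specific = subtype_edges - normal_edges
    return subtype_specific & interaction_edges


# Toy data (fabricated for illustration only): 3 genes x 4 cells.
genes = ["A", "B", "C"]
subtype_expr = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]], dtype=float)
normal_expr = np.array([[1, 2, 3, 4], [4, 1, 3, 2], [2, 4, 6, 8]], dtype=float)
interaction_edges = {frozenset(("A", "B")), frozenset(("B", "C"))}

subtype_edges = coexpression_edges(subtype_expr, genes)
normal_edges = coexpression_edges(normal_expr, genes)
fused = fuse_networks(subtype_edges, normal_edges, interaction_edges)
```

In this toy case A and B are co-expressed only in the subtype cells and also interact in the prior network, so the A-B edge is the one retained. Intersecting with the prior network is one simple way to shrink the graph before clustering; a weighted combination would be an equally plausible reading of "fuses".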
