Abstract

The development of high-throughput monitoring technologies enables interrogation of cancer samples at various levels of cellular activity. Capitalizing on these developments, public efforts such as The Cancer Genome Atlas (TCGA) generate disparate omic data for large patient cohorts. As demonstrated by recent studies, these heterogeneous data sources provide the opportunity to gain insights into the molecular changes that drive cancer pathogenesis and progression. However, these insights are limited by the vast search space and the resulting low statistical power to make new discoveries. In this paper, we propose methods for integrating disparate omic data using molecular interaction networks, with a view to gaining mechanistic insights into the relationship between molecular changes at different levels of cellular activity. Namely, we hypothesize that genes that play a role in cancer development and progression may be implicated by neither frequent mutation nor differential expression, and that network-based integration of mutation and differential expression data can reveal these “silent players”. For this purpose, we utilize network-propagation algorithms to simulate the information flow in the cell at a sample-specific resolution. We then use the propagated mutation and expression signals to identify genes that are not necessarily mutated or differentially expressed, but have an essential role in tumor development and patient outcome. We test the proposed method on breast cancer and glioblastoma multiforme data obtained from TCGA. Our results show that the proposed method can identify important proteins that are not readily revealed by molecular data, providing insights beyond what can be gleaned by analyzing different types of molecular data in isolation.

Highlights

  • The sequencing revolution of the last decade is producing vast amounts of data with clinical relevance

  • Due to limitations in proteomic and phosphoproteomic screening [12], changes in mediator proteins may not be readily detectable from genomic and transcriptomic data alone. We propose that such “silent” proteins can be detected by integrating mutation and differential expression data in a network context, since these proteins are likely to be in close proximity to both mutated and differentially expressed proteins in the network of protein-protein interactions (PPIs)

  • The input to our method consists of breast cancer (BRCA) and glioblastoma multiforme (GBM) data obtained from The Cancer Genome Atlas (TCGA) [13]


Summary

Introduction

The sequencing revolution of the last decade is producing vast amounts of data with clinical relevance. Translating these data into biomedical understanding remains a formidable challenge due to the typically low statistical power of sequencing studies, disease heterogeneity, experimental limitations, and other factors. A promising strategy to circumvent some of these problems is the integration of sequence data with other types of “omic” data [1]. We aim to apply the network propagation methodology to the integration of multiple omic data types in the context of cancer, with a view to gaining mechanistic insights into the relationship between molecular changes at different levels of cellular activity.
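
To make the idea of network propagation concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a random walk with restart as the propagation scheme, a hypothetical three-protein chain as the interaction network, a single sample, and the product of the two propagated signals as the combined score. The function name `propagate`, the `restart` parameter, and the toy data are all illustrative assumptions.

```python
import numpy as np


def propagate(adj, seed, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected network (assumed scheme).

    adj     : symmetric adjacency matrix of the interaction network
    seed    : non-negative per-gene signal (e.g. 1 for mutated genes)
    restart : probability of jumping back to the seed set at each step
    """
    adj = np.asarray(adj, dtype=float)
    seed = np.asarray(seed, dtype=float)
    if seed.sum() == 0:
        return np.zeros_like(seed)  # nothing to propagate

    # Column-normalize so that each column of W sums to 1.
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave isolated nodes untouched
    W = adj / col_sums

    p0 = seed / seed.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = restart * p0 + (1 - restart) * (W @ p)
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network: protein 0 -- protein 1 -- protein 2.
adj = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])
mutated = np.array([1, 0, 0])      # protein 0 is mutated
diff_expr = np.array([0, 0, 1])    # protein 2 is differentially expressed

p_mut = propagate(adj, mutated)
p_expr = propagate(adj, diff_expr)

# Assumed combination rule: a protein scores high only if it is close
# to both the mutated and the differentially expressed proteins.
combined = p_mut * p_expr
top = int(np.argmax(combined))
print(top, combined)
```

In this toy case protein 1 carries neither a mutation nor a differential expression signal, yet it receives the highest combined score because it lies between the two seeds, which is the intuition behind the “silent” proteins described above.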

