Abstract

Carcinogenesis is a complex process with multiple genetic and environmental factors contributing to the development of one or more tumors. Understanding the underlying mechanism of this process and identifying related markers to assess its outcome would lead to more directed treatment and thus significantly reduce the mortality rate of cancers. Recently, molecular diagnostics and prognostics based on the identification of patterns within gene expression profiles in the context of protein interaction networks have been reported. However, the predictive performance of these approaches has been limited. In this study we propose a novel integrated approach, named CAERUS, for the identification of gene signatures to predict cancer outcomes based on the domain interaction network in the human proteome. We first developed a model to score each protein by quantifying the domain connections to its interacting partners and the somatic mutations present in the domain. We then defined proteins as gene signatures if their scores were above a preset threshold. Next, for each gene signature, we quantified the correlation of the expression levels between this gene signature and its neighboring proteins. The results of the quantification in each patient were then used to predict cancer outcome by a modified naïve Bayes classifier. In this study we achieved a favorable accuracy of 88.3%, sensitivity of 87.2%, and specificity of 88.9% on a set of well-documented gene expression profiles of 253 consecutive breast cancer patients with different outcomes. We also compiled a list of cancer-associated gene signatures and domains, which provided testable hypotheses for further experimental investigation. Our approach proved successful on different independent breast cancer data sets as well as an ovarian cancer data set. This study constitutes the first predictive method to classify cancer outcomes based on the relationship between the domain organization and protein network.
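The thresholding and neighbor-correlation steps described above can be illustrated with a minimal Python sketch. It is not the CAERUS implementation: the abstract does not give the scoring formula, the per-patient quantification, or the modification to the naïve Bayes classifier, so none of those are reproduced. The function names, the toy scores, the toy network, and the use of a cohort-wide mean Pearson correlation are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (sd_x * sd_y)


def select_signatures(scores, threshold):
    """Keep proteins whose (hypothetical) score exceeds the preset threshold."""
    return sorted(gene for gene, score in scores.items() if score > threshold)


def neighbor_correlation(gene, network, expression):
    """Mean Pearson correlation between a gene and its network neighbors.

    Assumption: correlation is taken across the patient cohort; the
    abstract does not specify how the quantity is derived per patient.
    """
    neighbors = [n for n in network.get(gene, ()) if n in expression]
    if not neighbors:
        return 0.0
    total = sum(pearson(expression[gene], expression[n]) for n in neighbors)
    return total / len(neighbors)


# Toy inputs (hypothetical): protein scores, interaction partners,
# and expression levels across four patients.
scores = {"A": 0.9, "B": 0.2, "C": 0.7}
network = {"A": ["B", "C"], "C": ["A"]}
expression = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}

signatures = select_signatures(scores, threshold=0.5)
features = {g: neighbor_correlation(g, network, expression) for g in signatures}
```

In a full pipeline, values such as those in `features` would serve as inputs to a classifier; here they only show how a signature's expression can be related to that of its interaction partners.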

Highlights

  • Cancer development is a complex process driven by multiple genetic and environmental factors [1,2,3]

  • To identify a list of gene signatures and better predict cancer outcome, we developed an integrated and systematic approach that investigates alterations in gene expression profiles caused by disruptions of protein-protein interactions and domain-domain interactions in the human interactome

  • Our approach achieves favorable predictive performance when tested on a set of well-documented breast cancer patients, which suggests that the disrupted interactome is important in determining patient prognosis

Summary

Introduction

Cancer development is a complex process driven by multiple genetic and environmental factors [1,2,3]. Understanding the underlying mechanism of this process and identifying related markers to assess its outcome could lead to better management and treatment of this complex disease. The majority of breast cancer patients are currently over-treated [4] due to the lack of accurate assessment of the risk of metastasis. A substantial proportion of patients receive otherwise avoidable aggressive adjuvant therapy in accordance with the current guidelines [5]. Although the importance of identifying prognostic signatures that could accurately predict cancer outcomes is widely appreciated, it has remained a challenging task. With the emergence of large amounts of DNA microarray-based tumor gene expression profiles, molecular diagnostics and prognostics have begun to provide solutions to this challenge [6].

