Abstract

Background

Effectively predicting protein complexes not only helps to understand the structures and functions of proteins and their complexes, but is also useful for diagnosing disease and developing new drugs. Up to now, many methods have been developed to detect complexes by mining dense subgraphs from static protein-protein interaction (PPI) networks, while ignoring the value of other biological information and the dynamic properties of cellular systems.

Results

In this paper, based on our previous works CPredictor and CPredictor2.0, we present a new method, called CPredictor3.0, for predicting complexes from PPI networks using both gene expression data and protein functional annotations. The method follows the viewpoint that proteins in the same complex should have roughly similar functions and be active at the same time and place in cellular systems. We first detect active proteins by using gene expression data from different time points and, separately, cluster proteins by using gene ontology (GO) functional annotations. Then, for each time point, we intersect the set of active proteins generated from expression data with each protein cluster generated from functional annotations. Each resulting unique set is a cluster of proteins that have similar function(s) and are active at that time point. Following that, we map each cluster of active proteins of similar function onto a static PPI network and obtain a series of induced connected subgraphs. We treat these subgraphs as candidate complexes. Finally, by expanding and merging these candidate complexes, the predicted complexes are obtained.

We evaluate CPredictor3.0 and compare it with a number of existing methods on several PPI networks and benchmark complex datasets. The experimental results show that CPredictor3.0 achieves the highest F1-measure, which indicates that CPredictor3.0 outperforms these existing methods overall.

Conclusion

CPredictor3.0 can serve as a promising tool for protein complex prediction.
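The candidate-generation steps described in the abstract (set intersection per time point, then induced connected subgraphs on the static PPI network) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `candidate_complexes`, the `min_size` threshold, and the toy data are assumptions made for illustration, the active-protein sets and GO-based clusters are assumed to be precomputed, and the final expanding and merging step is omitted.

```python
from collections import deque


def candidate_complexes(active_by_time, functional_clusters, ppi_edges, min_size=2):
    """Return candidate complexes as a set of frozensets of protein names.

    active_by_time: dict mapping a time point to the set of active proteins
    functional_clusters: iterable of protein sets (e.g. from GO annotations)
    ppi_edges: iterable of (protein, protein) pairs from a static PPI network
    min_size: hypothetical threshold; smaller components are discarded
    """
    # Build an undirected adjacency map for the static PPI network.
    adjacency = {}
    for u, v in ppi_edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Step 1: intersect each active set with each functional cluster.
    # Using a set of frozensets keeps only the unique non-empty results.
    groups = set()
    for active in active_by_time.values():
        for cluster in functional_clusters:
            common = frozenset(active) & frozenset(cluster)
            if common:
                groups.add(common)

    # Step 2: for each group, take the connected components of the
    # subgraph that the group induces on the PPI network (BFS).
    candidates = set()
    for group in groups:
        unseen = set(group)
        while unseen:
            start = unseen.pop()
            component = {start}
            queue = deque([start])
            while queue:
                node = queue.popleft()
                for neighbour in adjacency.get(node, ()):
                    if neighbour in unseen:
                        unseen.remove(neighbour)
                        component.add(neighbour)
                        queue.append(neighbour)
            if len(component) >= min_size:
                candidates.add(frozenset(component))
    return candidates


# Toy data (invented for illustration only).
active_by_time = {"t1": {"A", "B", "C", "D"}, "t2": {"C", "D", "E"}}
functional_clusters = [{"A", "B", "E"}, {"C", "D"}]
ppi_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

candidates = candidate_complexes(active_by_time, functional_clusters, ppi_edges)
```

With this toy input the sketch yields two candidates, {A, B} and {C, D}. Protein E is active at t2 and shares a functional cluster with A and B, but those two are not active at t2, so E ends up alone and is dropped by the assumed size threshold.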

Highlights

  • Predicting protein complexes helps to understand the structures and functions of proteins and their complexes, and is useful for diagnosing disease and developing new drugs

  • We compute clusters of proteins that have similar function(s) and are active at the same time point by intersecting the set of active proteins generated from expression data with each protein cluster generated from functional annotations

  • We present a brief survey of related work, roughly classifying existing methods into four types: methods based on local dense subgraphs, methods based on the core-attachment model, methods based on dynamic protein-protein interaction (PPI) networks, and methods based on supervised learning

Summary

Introduction

Predicting protein complexes helps to understand the structures and functions of proteins and their complexes, and is useful for diagnosing disease and developing new drugs. Most proteins do not perform biological functions alone, but form protein complexes with others [1]. Identifying protein complexes is therefore very important for a deeper and more comprehensive understanding of cell composition and life processes. Biological techniques such as Tandem Affinity Purification with Mass Spectrometry (TAP-MS) [2] can detect protein complexes directly, but their accuracy is not high. They are also usually time-consuming and costly. As a result, biological techniques cannot meet the requirements of post-genome research for handling big biological data.
