Abstract

Background: Protein complexes, formed by non-covalent interactions among proteins, play important roles in cellular functions. Computational and purification methods have been used to identify many protein complexes and their cellular functions. However, their roles in causing disease remain poorly understood. Only a few studies have addressed the identification of disease-associated protein complexes, and most of them rely on complicated heterogeneous networks constructed from an out-of-date phenotype similarity network collected from the literature. In addition, they apply only to diseases for which tissue-specific data exist.

Methods: In this study, we propose a method to identify novel disease-protein complex associations. First, we introduce a framework to construct functional similarity protein complex networks, in which two protein complexes are functionally connected by shared protein elements, by shared annotating GO terms, or by protein interactions between the elements of each complex. Second, we propose a simple but effective neighborhood-based algorithm, which yields a local similarity measure, to rank disease candidate protein complexes.

Results: We compared the predictive performance of our proposed algorithm with that of two state-of-the-art network propagation algorithms, including the one used in our previous study. Our algorithm performed statistically significantly better than both on all the constructed functional similarity protein complex networks. In addition, it ran about 32 times faster than these two algorithms. Moreover, our proposed method always achieved high performance in terms of AUC values, irrespective of how the functional similarity protein complex networks were constructed and which algorithm was used. Its performance was also higher than that reported for some existing methods based on complicated heterogeneous networks. Finally, we tested our method on prostate cancer and selected the 100 top-ranked candidate protein complexes. Interestingly, 69 of them were supported by evidence, in that at least one of their protein elements is known to be associated with prostate cancer.

Conclusions: Our proposed method, comprising the framework to construct functional similarity protein complex networks and the neighborhood-based algorithm applied to these networks, could be used to identify novel disease-protein complex associations.

Electronic supplementary material: The online version of this article (doi:10.1186/s13015-015-0044-6) contains supplementary material, which is available to authorized users.

Highlights

  • Protein complexes formed by non-covalent interaction among proteins play important roles in cellular functions

  • A protein complex of SCRIB, NOS1AP and VANGL1 is associated with breast cancer progression [10]; the TWIST/Mi2/NuRD protein complex has an essential role in cancer metastasis [11]; and an aberrant protein complex consisting of prostaglandin-D-synthase (PDS) and transthyretin (TTR) is a biomarker of Alzheimer’s disease [12]

  • Compared with our previous study [21], which used the Random Walk with Restart (RWR) algorithm on a functional similarity protein complex network built from shared protein elements only, this study presents a framework to construct functional similarity protein complex networks based on shared protein elements, shared annotating Gene Ontology (GO) terms, and protein interactions
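To make the idea in the highlights concrete, the following is a minimal Python sketch of one way to build a functional similarity network from shared protein elements and then rank candidate complexes by a neighborhood-based score. It is an illustration under stated assumptions, not the authors' implementation: the Jaccard edge weight, the scoring rule (the share of a candidate's edge weight that points to known disease complexes), the function names, and the toy complexes C1 to C4 with proteins P1 to P7 are all hypothetical. The GO-term and protein-interaction variants of the network are not shown.

```python
from itertools import combinations


def build_shared_protein_network(complexes):
    """Connect two complexes when they share at least one protein.

    Assumption: the edge weight is the Jaccard index of the two protein
    sets; the source does not specify the weighting scheme.
    """
    network = {name: {} for name in complexes}
    for a, b in combinations(complexes, 2):
        shared = complexes[a] & complexes[b]
        if shared:
            weight = len(shared) / len(complexes[a] | complexes[b])
            network[a][b] = weight
            network[b][a] = weight
    return network


def rank_candidates(network, known_disease_complexes):
    """Rank non-seed complexes by a simple local (neighborhood) score.

    Assumption: the score is the fraction of a candidate's total edge
    weight that goes to complexes already known for the disease.
    """
    scores = {}
    for node, neighbors in network.items():
        if node in known_disease_complexes:
            continue  # known complexes are seeds, not candidates
        total = sum(neighbors.values())
        to_known = sum(
            w for n, w in neighbors.items() if n in known_disease_complexes
        )
        scores[node] = to_known / total if total else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: complex name -> set of member proteins.
complexes = {
    "C1": {"P1", "P2", "P3"},
    "C2": {"P3", "P4"},
    "C3": {"P4", "P5"},
    "C4": {"P6", "P7"},
}
network = build_shared_protein_network(complexes)
ranking = rank_candidates(network, known_disease_complexes={"C1"})
for name, score in ranking:
    print(name, round(score, 3))
```

Because the score only looks at direct neighbors, each candidate is evaluated in a single pass over its edges, which is consistent with a local measure being cheaper than an iterative propagation over the whole network.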

Summary

Introduction

Protein complexes formed by non-covalent interaction among proteins play important roles in cellular functions. There exist only a few studies for the identification of disease-associated protein complexes. They mostly utilize complicated heterogeneous networks which are constructed based on an out-of-date database of phenotype similarity network collected from literature. A network propagation algorithm was applied on the heterogeneous network to prioritize candidate genes. They reported that their method outperformed other methods which were solely based on the human protein interaction network and phenotype similarity network for the prediction of novel disease phenotype-gene associations [17,18]. The phenotype similarity network was collected from a relatively old published study [23]; it is not up to date. These methods were limited to predicting diseases for which a tissue-specific protein interaction network [19] or gene expression data [22] exist. The functional similarity between protein complexes in the constructed protein complex network was only based on their shared protein elements.
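For readers unfamiliar with network propagation, the sketch below shows the standard Random Walk with Restart (RWR) iteration, p(t+1) = (1 - r) W p(t) + r p(0), in plain Python. It is a generic illustration of the technique rather than the code used in any of the cited studies; the restart probability of 0.5, the convergence tolerance, the function name, and the toy graph with nodes A to D are assumptions made for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5,
                             tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: dict mapping node -> {neighbor: weight}, symmetric,
               with every node present as a key.
    seeds:     set of nodes known to be associated with the disease.
    restart:   probability of jumping back to a seed (assumed value).
    """
    nodes = sorted(adjacency)
    # Total outgoing weight per node, used to normalise transitions.
    out_weight = {j: sum(adjacency[j].values()) for j in nodes}
    # Initial distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for j in nodes:
            if out_weight[j] == 0:
                continue  # isolated node: nothing to propagate
            for i, w in adjacency[j].items():
                nxt[i] += (1 - restart) * p[j] * w / out_weight[j]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy graph: a path A - B - C plus an isolated node D.
adjacency = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0},
    "D": {},
}
scores = random_walk_with_restart(adjacency, seeds={"A"})
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Unlike a local neighborhood score, this iteration repeatedly sweeps the whole network until the probabilities stabilise, so nodes that are not directly connected to a seed (C in the toy graph) still receive a non-zero score.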

