Abstract

Background: Although different protein-protein physical interaction (PPI) datasets exist for Escherichia coli, no common methodology exists to integrate these datasets and extract reliable modules reflecting the existing biological processes and protein complexes. The naïve Bayesian formula is the most widely accepted method for integrating different PPI datasets into a single weighted PPI network, but determining proper weights in such a network is still a major problem.

Results: In this paper, we propose a new methodology to integrate various physical PPI datasets into a single weighted PPI network in such a way that the modules detected in the PPI network exhibit the highest similarity to available functional modules. We used co-expression modules as functional modules, and we showed that direct functional modules detected from Gene Ontology terms could be used as an alternative dataset. After running this integration methodology over six different physical PPI datasets, orthologous high-confidence interactions from a related organism and two AP-MS PPI datasets gained high weights in the integrated networks, while the weights for one AP-MS PPI dataset and two other datasets derived from public databases converged to zero. The majority of detected modules were shaped around one or a few hub proteins. Still, a large number of highly interacting protein modules were detected that are functionally relevant and are likely to form protein complexes.

Conclusions: We provide a new high-confidence protein complex prediction method supported by functional studies and literature mining.

Highlights

  • Although different protein-protein physical interaction (PPI) datasets exist for Escherichia coli, no common methodology exists to integrate these datasets and extract reliable modules reflecting the existing biological processes and protein complexes

  • In this study, we introduced a new methodology to integrate the available PPI datasets into a weighted PPI network in such a way that the detected modules exhibit the highest similarity to predefined sets of modules

  • We used co-expression modules as functional modules and showed that direct functional modules detected from Gene Ontology terms could lead to similar results


Summary

Introduction

Although different protein-protein physical interaction (PPI) datasets exist for Escherichia coli, no common methodology exists to integrate these datasets and extract reliable modules reflecting the existing biological processes and protein complexes. A natural way to represent protein-protein interaction relations is through a network: each node corresponds to a protein and each undirected connection corresponds to a protein-protein interaction. This PPI network can be unweighted when all the interactions come from a reliable data source, or weighted if a confidence value is assigned to each connection. Different clustering and classification methods have been applied to PPI networks, and the results have been compared to known protein complexes and functional categories [3]. The naïve Bayesian classifier is the most common classifier applied to PPI networks so far [1, 4, 5, 6].
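The weighted network representation described above can be illustrated with a minimal Python sketch. It is not the methodology of this paper: the function name `integrate_ppi`, the dataset names, the protein identifiers, and the per-dataset weights are all hypothetical. The combination rule is also an assumption, a simple noisy-OR that treats datasets as independent evidence, in the spirit of naïve Bayesian integration.

```python
from collections import defaultdict


def integrate_ppi(datasets, weights):
    """Merge several PPI edge lists into one weighted, undirected network.

    datasets: dict mapping dataset name -> iterable of (protein_a, protein_b)
    weights:  dict mapping dataset name -> reliability in [0, 1]
              (hypothetical values, not taken from the paper)
    Returns a dict mapping frozenset({a, b}) -> combined confidence.
    """
    # For each edge, track the probability that every supporting dataset is wrong.
    miss = defaultdict(lambda: 1.0)
    for name, edges in datasets.items():
        w = weights[name]
        # Undirected edges: (a, b) and (b, a) collapse to the same frozenset.
        # Self-interactions are skipped, and each dataset counts once per edge.
        unique_edges = {frozenset((a, b)) for a, b in edges if a != b}
        for edge in unique_edges:
            miss[edge] *= 1.0 - w
    # Assumed noisy-OR rule: confidence = 1 - prod(1 - w_d) over supporting datasets.
    return {edge: 1.0 - m for edge, m in miss.items()}


# Illustrative input only: placeholder protein names and made-up weights.
datasets = {
    "apms_a": [("P1", "P2"), ("P2", "P3")],
    "orthologs": [("P2", "P1")],
    "database": [("P4", "P5")],
}
weights = {"apms_a": 0.5, "orthologs": 0.5, "database": 0.0}

network = integrate_ppi(datasets, weights)
for edge, confidence in sorted(network.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(edge), round(confidence, 3))
```

In this sketch, an edge supported by two datasets ends up with a higher confidence than an edge supported by one, and an edge found only in a dataset whose weight is zero stays in the network with zero confidence. Choosing the weights themselves is the hard part, and it is the problem the paper addresses.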

