Abstract
Extracting the maximal amount of information from a biological network, for example predicting the function of a protein from a protein-protein interaction (PPI) network, has drawn much attention from researchers. It is well known that biological networks consist of modules (communities), sets of nodes that are more densely connected among themselves than with the rest of the network. However, practical applications of community information have been rather limited. For protein function prediction on a network, it has been shown that none of the existing community-based methods outperforms a simple neighbor-based method. Recently, we have shown that proper utilization of a highly optimal modularity community structure for protein function prediction can outperform neighbor-assisted methods. In this study, we propose two function prediction approaches on bipartite networks that consider both the community structure and the neighbor information of the network: 1) a simple screening method and 2) a random forest based method. We demonstrate that our community-assisted methods outperform neighbor-assisted methods and that the random forest method yields the best performance. In addition, we show that using the optimal community structure is essential for more accurate function prediction on the protein-complex bipartite network of Saccharomyces cerevisiae. Community detection can be carried out either by applying a modified modularity directly to the original bipartite network or by first projecting the network onto a single-mode network (i.e., a PPI network) and then applying community detection to the reduced network. We find that the projection leads to a significant loss of information. Since our prediction methods rely only on the network topology, they can be applied to various fields where an efficient network-based analysis is required.
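As a rough illustration of why projection can discard information, the sketch below projects two different protein-complex bipartite networks onto proteins and obtains the same PPI network. The function name `project_to_ppi`, the toy proteins and complexes, and the rule "link two proteins if they share a complex" are assumptions made for this example, not the procedure or data of the study.

```python
from itertools import combinations


def project_to_ppi(complexes):
    """Project a protein-complex bipartite network onto its proteins.

    `complexes` maps a complex name to the set of its member proteins.
    Two proteins are linked if they share at least one complex
    (an assumed, unweighted projection rule).
    """
    edges = set()
    for members in complexes.values():
        # Sorting gives each undirected edge a single canonical form.
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))
    return edges


# Hypothetical toy data: one three-protein complex ...
one_complex = {"C1": {"P1", "P2", "P3"}}
# ... versus three pairwise complexes over the same proteins.
three_complexes = {
    "C1": {"P1", "P2"},
    "C2": {"P2", "P3"},
    "C3": {"P1", "P3"},
}

ppi_a = project_to_ppi(one_complex)
ppi_b = project_to_ppi(three_complexes)

# Both bipartite networks collapse to the same triangle of proteins,
# so the complex membership cannot be recovered from the projection.
print(ppi_a == ppi_b)
```

Under these assumptions, the number and composition of complexes are no longer distinguishable once only the projected edges are kept.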
Highlights
Recent revolutionary advances in protein sequencing technology have made the proteome-scale protein-protein interaction (PPI) data of many species available
We address the above issues by performing community-assisted function predictions of the protein-complex network of Saccharomyces cerevisiae [18], a bipartite network consisting of proteins and complexes, with MIPS function annotations [21]
We find that projecting the original bipartite network of a protein-complexome network into a PPI network leads to a significant loss of information
Summary
The results show that community-assisted methods significantly improve the efficiency of function prediction, with the largest improvement obtained when the highest-modularity community structure is used. For the community detection of the yeast protein-complex network with modularity optimization, we used the conformational space annealing (CSA) and simulated annealing (SA) methods [28].
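For readers unfamiliar with the quantity being optimized, here is a minimal sketch of the standard single-mode (Newman) modularity of a given partition, written as a sum over communities of the fraction of internal edges minus the squared fraction of total degree. It is not the modified bipartite modularity used for the original network, and it does not implement the CSA or SA optimizers; the function name `modularity` and the toy graph are assumptions for illustration only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    `edges` is a list of (u, v) pairs with no duplicates or self-loops;
    `community` maps each node to a community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both ends in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    degree_sum = defaultdict(int)  # total degree per community
    for node, k in degree.items():
        degree_sum[community[node]] += k

    # Q = sum over communities of (l_c / m) - (d_c / 2m)^2
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}

q_split = modularity(edges, split)
q_single = modularity(edges, single)
print(q_split > q_single)
```

An optimizer such as simulated annealing would search over partitions for the one with the highest value of this score; here the two-triangle split scores higher than placing every node in one community.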