Abstract
Background: Characterizing unknown proteins through computational approaches is one of the most challenging problems of in silico biology and has attracted worldwide interest and considerable effort. Several computational methods have been proposed to address this problem; they are based either on homology mapping or on the context of protein interaction networks.
Results: In this paper, two algorithms are proposed that integrate the protein-protein interaction (PPI) network, proteins’ domain information and protein complexes. The first is domain combination similarity (DCS), which combines the domain compositions of both proteins and their neighbors. The second is domain combination similarity in context of protein complexes (DSCP), which extends the protein functional similarity definition of DCS by combining the domain compositions of both proteins and the complexes that include them. The new algorithms are tested on networks of the model species Saccharomyces cerevisiae to predict the functions of unknown proteins using cross-validation. Compared with several existing algorithms, the results demonstrate the effectiveness of the proposed methods in protein function prediction. Furthermore, the DSCP algorithm, when using experimentally determined complex data, is robust when a large percentage of the proteins in the network are unknown, and it outperforms DCS and several other existing algorithms.
Conclusions: The accuracy of protein function prediction can be improved by integrating the protein-protein interaction (PPI) network, proteins’ domain information and protein complexes.
Highlights
Characterizing unknown proteins through computational approaches is one of the most challenging problems of in silico biology and has attracted worldwide interest and considerable effort
With respect to the above issues, we propose a new algorithm, named DCS, that defines a domain combination similarity in protein-protein interaction (PPI) networks as a measure of protein function similarity
Domain combination similarity in context of protein complexes (DSCP): we argue that the original approach of taking neighbors as the domain context in DCS can be further improved by using protein complex information instead
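The idea behind the two highlights above can be illustrated with a minimal sketch. This page does not give the DCS or DSCP formulas, so the code below is not the authors' definition: it swaps in a plain Jaccard index over domain sets, and the `weight` parameter, the function names, and the protein and domain identifiers are all hypothetical. It only shows how the "context" of a protein can be switched from its PPI neighbors to its co-members in complexes.

```python
from typing import Dict, Iterable, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def context_domains(protein: str,
                    domains: Dict[str, Set[str]],
                    context: Dict[str, Set[str]]) -> Set[str]:
    """Domains of a protein together with the domains of its context members."""
    combined = set(domains.get(protein, set()))
    for member in context.get(protein, set()):
        combined |= domains.get(member, set())
    return combined


def similarity(p: str, q: str,
               domains: Dict[str, Set[str]],
               context: Dict[str, Set[str]],
               weight: float = 0.5) -> float:
    """Assumed stand-in score: weighted mix of own-domain and context-domain overlap."""
    own = jaccard(domains.get(p, set()), domains.get(q, set()))
    ctx = jaccard(context_domains(p, domains, context),
                  context_domains(q, domains, context))
    return weight * own + (1.0 - weight) * ctx


def complex_context(complexes: Iterable[Set[str]]) -> Dict[str, Set[str]]:
    """Map each protein to the other members of the complexes that contain it."""
    ctx: Dict[str, Set[str]] = {}
    for members in complexes:
        for protein in members:
            ctx.setdefault(protein, set()).update(members - {protein})
    return ctx


# Hypothetical toy data: protein -> domain identifiers, and PPI neighbors.
domains = {"P1": {"PF1", "PF2"}, "P2": {"PF1"}, "P3": {"PF3"}, "P4": {"PF2", "PF3"}}
neighbors = {"P1": {"P3"}, "P2": {"P4"}, "P3": {"P1"}, "P4": {"P2"}}
complexes = [{"P1", "P2", "P3"}]

# Neighbor context (DCS-like) versus complex context (DSCP-like).
score_neighbors = similarity("P1", "P2", domains, neighbors)
score_complexes = similarity("P1", "P3", domains, complex_context(complexes))
```

Passing either `neighbors` or `complex_context(complexes)` as the `context` argument is the only difference between the two variants in this sketch, which mirrors the stated change from neighbor-based to complex-based domain context.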
Summary
Characterizing unknown proteins through computational approaches is one of the most challenging problems of in silico biology and has attracted worldwide interest and considerable effort. Several computational methods have been proposed to address this problem; they are based either on homology mapping or on the context of protein interaction networks. Annotating the function of a protein is an important challenge in post-genomics because proteins play critical roles in various biological processes. Determining protein functions experimentally is expensive and time-consuming. Computational prediction, that is, assigning functions to unknown proteins, becomes increasingly important as the volume of sequenced proteins continues to expand at an exponential rate. Existing computational methods based on PPI can be roughly divided into two main categories: direct methods, which use the protein interactions directly, and module-assisted schemes, which use functional modules to infer protein functions as a whole [9].
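As an illustration of the "direct" category, the sketch below implements simple neighbor counting: an unknown protein receives the functions that occur most often among its annotated interaction partners. This is a generic baseline chosen for illustration, not one of the methods proposed in the paper, and the function name, the `top_k` parameter, and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def predict_by_neighbor_counting(protein: str,
                                 interactions: Dict[str, Set[str]],
                                 annotations: Dict[str, Set[str]],
                                 top_k: int = 1) -> List[str]:
    """Return the top_k functions most frequent among the protein's neighbors."""
    counts: Counter = Counter()
    for neighbor in interactions.get(protein, set()):
        # Unannotated neighbors contribute nothing.
        counts.update(annotations.get(neighbor, set()))
    # Sort by descending count, then alphabetically for deterministic ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top_k]]


# Hypothetical toy network: protein "U" is unannotated.
ppi = {"U": {"A", "B", "C"}}
known_functions = {
    "A": {"transport"},
    "B": {"transport", "signaling"},
    "C": {"metabolism"},
}

best = predict_by_neighbor_counting("U", ppi, known_functions)
top_two = predict_by_neighbor_counting("U", ppi, known_functions, top_k=2)
```

A module-assisted scheme would instead first group proteins into functional modules and then assign functions at the module level.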