Incorporating functional inter-relationships into protein function prediction algorithms

Gaurav Pandey, Chad L Myers, Vipin Kumar

doi:10.1186/1471-2105-10-142

Open Access

https://doi.org/10.1186/1471-2105-10-142

Journal: BMC Bioinformatics
Publication Date: May 12, 2009
Citations: 117
License type: CC BY 2.0

Affiliation: University of Minnesota

Abstract

Background: Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. For standard classification-based approaches, these inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder-to-learn distinction between similar GO terms.

Results: We propose a method to enhance the performance of classification-based protein function prediction algorithms by exploiting the inter-relationships between the functional classes that make up functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over a hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and that the classes that benefit most from this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy to improve the accuracy of predictions made over a set of base classes. Finally, we provide qualitative and quantitative evidence that this incorporation of functional inter-relationships enables the discovery of interesting biology in the form of novel functional annotations for several yeast proteins, such as Sna4, Rtn1 and Lin1.

Conclusion: We implemented and evaluated a methodology for incorporating inter-relationships between functional classes into a standard classification-based protein function prediction algorithm. Our results show that this incorporation can help improve the accuracy of such algorithms, and help uncover novel biology in the form of previously unknown functional annotations. The complete source code, a sample data set and the additional files for this paper are available free of charge for non-commercial use at .

Highlights

Functional classification schemes that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction
We evaluate our algorithm on two large microarray data sets [13,15], a recent protein interaction data set [16] and a combination of interaction and microarray data sets, each of which is used for the modeling and prediction of over a hundred classes from the Gene Ontology (GO) Biological Process ontology
We provide qualitative and quantitative evidence that this incorporation of functional inter-relationships enables the discovery of interesting biology in the form of novel functional annotations for several yeast proteins, such as Sna4, Rtn1 and Lin1

Summary

Introduction

Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. The Gene Ontology not only captures protein annotations to a set of functional classes, but also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. For standard classification-based approaches, these inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder-to-learn distinction between similar GO terms. The key premise underlying classification-based prediction of protein function is that proteins belonging to the same functional class have "similar" biological attributes.
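
The idea of letting related classes lend training evidence to a small class can be illustrated with a short Python sketch. This is not the authors' implementation: the function name `knn_class_scores`, the use of cosine similarity between proteins, and the additive weighting rule are all assumptions made for illustration. The class-to-class similarity matrix `class_sim` is taken as given here; in the paper it would come from a semantic similarity measure computed over the ontology.

```python
import numpy as np


def knn_class_scores(X_train, Y_train, x_query, k, class_sim=None):
    """Score every class for one query protein with a weighted k-NN vote.

    X_train   : (n, d) feature vectors of annotated proteins.
    Y_train   : (n, c) binary matrix, Y[i, j] = 1 if protein i has class j.
    class_sim : (c, c) class similarity in [0, 1] with ones on the diagonal
                (hypothetical input). None falls back to plain k-NN.
    """
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Cosine similarity between the query and each training protein
    # (an assumed choice of protein similarity).
    norms = np.linalg.norm(X_train, axis=1) * np.linalg.norm(x_query)
    sims = (X_train @ x_query) / np.where(norms == 0, 1.0, norms)

    # Indices of the k most similar training proteins.
    nearest = np.argsort(-sims, kind="stable")[:k]

    if class_sim is None:
        class_sim = np.eye(Y_train.shape[1])
    class_sim = np.asarray(class_sim, dtype=float)

    # credit[i, j]: how much neighbour i supports class j. A neighbour
    # annotated with a related class j' contributes class_sim[j, j'].
    credit = Y_train[nearest] @ class_sim.T

    # Weight each neighbour's credit by its similarity to the query.
    return sims[nearest] @ credit


# Toy data: class 0 is small and related to class 1; class 2 is unrelated.
X_train = [[1, 0], [2, 0], [0, 1]]
Y_train = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
class_sim = [[1.0, 0.5, 0.0],
             [0.5, 1.0, 0.0],
             [0.0, 0.0, 1.0]]

plain = knn_class_scores(X_train, Y_train, [1, 0], k=2)
enriched = knn_class_scores(X_train, Y_train, [1, 0], k=2, class_sim=class_sim)
print(plain, enriched)
```

In this toy case neither of the two nearest neighbours is annotated with class 0, so plain k-NN gives it a score of 0. With the similarity matrix, the neighbour annotated with the related class 1 lends class 0 a score of 0.5, while the scores of classes 1 and 2 are unchanged.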

Methods

Results

Conclusion
