Predicting protein function via downward random walks on a gene ontology.

Guoxian Yu, Jiming Liu, Hailong Zhu, Carlotta Domeniconi

doi:10.1186/s12859-015-0713-y

Abstract

Background: High-throughput bio-techniques accumulate an ever-increasing amount of genomic and proteomic data. These data are far from being functionally characterized, despite advances in the functional annotation of genes (or their protein products). Due to experimental techniques and to the research bias in biology, the regularly updated functional annotation databases, i.e., the Gene Ontology (GO), are far from complete. Given the importance of protein functions for biological studies and drug design, proteins should be annotated more comprehensively and precisely.

Results: We propose downward Random Walks (dRW) to predict missing (or new) functions of partially annotated proteins. In particular, we apply downward random walks with restart on the GO directed acyclic graph, starting from the available functions of a protein, to estimate the probability of missing functions. To further boost prediction accuracy, we extend dRW to dRW-kNN. dRW-kNN computes the semantic similarity between proteins from their functional annotations; it then predicts functions based on the functions estimated by dRW, together with the functions associated with the k nearest proteins. The proposed models can predict two kinds of missing functions: (i) those that are missing for a protein but associated with other proteins of interest; (ii) those that are not available for any protein of interest, but exist in the GO hierarchy. Experimental results on the proteins of Yeast and Human show that dRW and dRW-kNN can replenish functions more accurately than other related approaches, especially for sparse functions associated with no more than 10 proteins.

Conclusion: The empirical study shows that the semantic similarity between GO terms and the ontology hierarchy play important roles in predicting protein function. The proposed dRW and dRW-kNN can serve as tools for replenishing functions of partially annotated proteins.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0713-y) contains supplementary material, which is available to authorized users.
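As a rough illustration of the random-walk-with-restart idea described in the abstract, the sketch below propagates probability mass from a protein's known terms down a toy term hierarchy. It is not the authors' implementation: the function name downward_rwr, the term names, the uniform parent-to-child transition probabilities, the self-loop on leaf terms, and the restart value of 0.5 are all assumptions made for this example. The abstract indicates that semantic similarity between GO terms also plays a role, which this sketch leaves out.

```python
from typing import Dict, Iterable, List


def downward_rwr(
    children: Dict[str, List[str]],
    seeds: Iterable[str],
    restart: float = 0.5,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Random walk with restart that only moves from a term to its children.

    children maps each term to its direct child terms (hypothetical toy DAG).
    seeds are the terms already annotated to the protein.
    Assumptions: uniform transition to children; leaf terms keep their mass.
    """
    seed_set = set(seeds)
    if not seed_set:
        raise ValueError("at least one seed term is required")

    # Collect every term that appears as a parent, a child, or a seed.
    terms = set(children) | seed_set
    for kids in children.values():
        terms.update(kids)

    # Restart distribution: uniform over the protein's known terms.
    p0 = {t: (1.0 / len(seed_set) if t in seed_set else 0.0) for t in terms}
    p = dict(p0)

    for _ in range(max_iter):
        # Restart step: a fraction of the mass jumps back to the seeds.
        nxt = {t: restart * p0[t] for t in terms}
        # Walk step: the rest moves one level down the hierarchy.
        for term, mass in p.items():
            targets = children.get(term) or [term]  # leaf: self-loop
            share = (1.0 - restart) * mass / len(targets)
            for child in targets:
                nxt[child] += share
        delta = sum(abs(nxt[t] - p[t]) for t in terms)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy hierarchy; real GO terms have identifiers such as GO:0008150.
go_children = {
    "root": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
}
known_terms = ["A"]
scores = downward_rwr(go_children, known_terms)

# Rank terms the protein does not yet have by their stationary score.
candidates = sorted(
    (t for t in scores if t not in known_terms),
    key=lambda t: scores[t],
    reverse=True,
)
```

In this toy setting only descendants of the known term receive a nonzero score, which matches the intuition that a downward walk proposes more specific terms below the existing annotations. A kNN step in the spirit of dRW-kNN would combine these scores with the terms of similar proteins; that step is not shown here.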

Highlights

High-throughput bio-techniques accumulate an ever-increasing amount of genomic and proteomic data
The Gene Ontology Annotation (GOA) files of Yeast and Human were obtained from the European Bioinformatics Institute
An interesting observation is that the average number of terms associated with a protein is close to the standard deviation; this is because some proteins in the GOA are not annotated with any term

Summary

Introduction

High-throughput bio-techniques accumulate an ever-increasing amount of genomic and proteomic data. These data are far from being functionally characterized, despite advances in the functional annotation of genes (or their protein products). Due to experimental techniques and to the research bias in biology, the regularly updated functional annotation databases, i.e., the Gene Ontology (GO), are far from complete. GO is a controlled vocabulary of terms for describing the biological roles of genes and their products (i.e., proteins) [1]. Progress in protein functional annotation lags far behind the pace at which proteomic and genomic data accumulate. Schones et al. [4] found that the functional annotations of high-throughput genomic and proteomic data are biased and shallow. Automatically annotating the functional roles of these proteins using GO

Journal: BMC Bioinformatics	Publication Date: Aug 27, 2015
Citations: 59	License type: cc-by

Similar Papers

New avenues in protein function prediction
Iddo Friedberg ... Martin Jambon
Protein Science | VOL. 15
01 Jun 2006

DeepGOA: Predicting Gene Ontology Annotations of Proteins via Graph Convolutional Network
Guangjie Zhou ... Guoxian Yu
01 Nov 2019

Improving protein function prediction by learning and integrating representations of protein sequences and function labels.
Frimpong Boadu ... Jianlin Cheng
Bioinformatics advances | VOL. 4
17 Aug 2024

A Deep Learning Framework for Predicting Protein Functions With Co-Occurrence of GO Terms.
Min Li ... Fuhao Zhang
IEEE/ACM transactions on computational biology and bioinformatics | VOL. 20
01 Mar 2023
