Protein function prediction as approximate semantic entailment

Maxat Kulmanov, Francisco J Guzmán-Vega, Paula Duek Roggli, Lydie Lane, Stefan T Arold, Robert Hoehndorf

doi:10.1038/s42256-024-00795-w

Abstract

The Gene Ontology (GO) is a formal, axiomatic theory with over 100,000 axioms that describe the molecular functions, biological processes and cellular locations of proteins in three subontologies. Predicting the functions of proteins using the GO requires both learning and reasoning capabilities in order to maintain consistency and exploit the background knowledge in the GO. Many methods have been developed to automatically predict protein functions, but effectively exploiting all the axioms in the GO for knowledge-enhanced learning has remained a challenge. We have developed DeepGO-SE, a method that predicts GO functions from protein sequences using a pretrained large language model. DeepGO-SE generates multiple approximate models of GO, and a neural network predicts the truth values of statements about protein functions in these approximate models. We aggregate the truth values over multiple models so that DeepGO-SE approximates semantic entailment when predicting protein functions. We show, using several benchmarks, that the approach effectively exploits background knowledge in the GO and improves protein function prediction compared to state-of-the-art methods.
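The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that each approximate model assigns a truth value in [0, 1] to a statement such as "protein P has function F", and that the values are combined by taking the minimum, mirroring the definition of semantic entailment as truth in every model. The function names, the example scores, and the choice of the minimum as the aggregation are assumptions of this sketch and are not taken from the DeepGO-SE implementation.

```python
from typing import Dict, List, Sequence


def approximate_entailment(truth_values: Sequence[float]) -> float:
    """Combine the truth values of one statement across several models.

    A statement is semantically entailed when it holds in every model,
    so this sketch uses the minimum as the aggregate (an assumption).
    """
    if not truth_values:
        raise ValueError("at least one model is required")
    return min(truth_values)


def aggregate_predictions(
    per_model_scores: List[Dict[str, float]],
) -> Dict[str, float]:
    """Aggregate per-model scores for every GO term scored by all models."""
    if not per_model_scores:
        raise ValueError("at least one model is required")
    # Only terms that every model has scored can be aggregated.
    shared_terms = set.intersection(*(set(s) for s in per_model_scores))
    return {
        term: approximate_entailment([s[term] for s in per_model_scores])
        for term in shared_terms
    }


# Hypothetical scores from three approximate models for one protein.
per_model_scores = [
    {"GO:0003674": 0.95, "GO:0005515": 0.80},
    {"GO:0003674": 0.90, "GO:0005515": 0.40},
    {"GO:0003674": 0.97, "GO:0005515": 0.75},
]
combined = aggregate_predictions(per_model_scores)
print(combined)
```

Under this assumption a single dissenting model is enough to lower a term's score, which is why a term that two models rate highly can still end up with a low aggregate.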

Journal: Nature Machine Intelligence
Publication Date: Feb 1, 2024
Citations: 7
License type: CC BY 4.0

Similar Papers

Year 2 Report: Protein Function Prediction Platform
C Zhou
27 Apr 2012

A weighted k-nearest neighbor method for gene ontology based protein function prediction
Saket Kharsikar ... Francisco Moore
01 Aug 2007

Feature Extraction in Spatially-Conserved Regions and Protein Functional Classification
Bum Ju Lee ... Dae-Sung Kim
01 Jan 2007

DCGR: feature extractions from protein sequences based on CGR via remodeling multiple information
Zengchao Mu ... Guojun Li
BMC bioinformatics | VOL. 20
20 Jun 2019
