GOProFormer: A Multi-Modal Transformer Method for Gene Ontology Protein Function Prediction.

Anowarul Kabir, Amarda Shehu

doi:10.3390/biom12111709

Abstract

Protein Language Models (PLMs) have been shown to learn sequence representations useful for a range of prediction tasks, including subcellular localization, evolutionary relationships, and family membership. They have yet to be demonstrated useful for protein function prediction. In particular, the problem of automatic annotation of proteins under the Gene Ontology (GO) framework remains open. This paper makes two key contributions. First, it introduces a novel method that leverages the transformer architecture in two ways: a sequence transformer encodes protein sequences in a task-agnostic feature space, and a graph transformer learns a representation of GO terms while respecting their hierarchical relationships. The learned sequence and GO term representations are combined and used for multi-label classification, with the labels corresponding to GO terms. The method is shown to be superior to recent representative GO prediction methods. Second, the paper presents an in-depth investigation of different ways of constructing training and testing datasets, showing that existing approaches under- or over-estimate the generalization power of a model. A novel approach is proposed to address these issues, resulting in a new benchmark dataset for rigorously evaluating and comparing methods and advancing the state of the art.
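To make the multi-label step concrete, here is a minimal sketch of one plausible way to combine a protein representation with GO term representations. The abstract does not specify the combination operator, so the dot product followed by a per-term sigmoid is an assumption, and the function name `score_go_terms`, the toy vectors, and the term names are all hypothetical. In the actual method the vectors would come from the sequence transformer and the graph transformer.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_go_terms(protein_vec, go_term_vecs):
    """Score every GO term independently for one protein.

    Assumption: the two representations are combined with a dot product.
    Each term gets its own sigmoid, so several terms can be predicted at
    once (multi-label), unlike a softmax over mutually exclusive classes.
    """
    scores = {}
    for term, term_vec in go_term_vecs.items():
        logit = sum(p * g for p, g in zip(protein_vec, term_vec))
        scores[term] = sigmoid(logit)
    return scores


# Toy stand-ins for learned embeddings (hypothetical values, not real data).
protein_vec = [1.0, -1.0, 0.5]
go_term_vecs = {
    "term_a": [2.0, 0.0, 0.0],  # logit = 2.0
    "term_b": [0.0, 2.0, 0.0],  # logit = -2.0
    "term_c": [1.0, 1.0, 2.0],  # logit = 1.0
}

probs = score_go_terms(protein_vec, go_term_vecs)

# Every term whose probability clears the threshold becomes a predicted label.
predicted = sorted(term for term, p in probs.items() if p >= 0.5)
print(predicted)
```

The 0.5 threshold is also an illustrative choice; in practice a threshold would typically be tuned on held-out data.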


Journal: Biomolecules
Publication Date: Nov 18, 2022
Citations: 13
License type: CC BY 4.0


