Decoding functional proteome information in model organisms using protein language models.

Israel Barrios-Núñez, Gemma I Martínez-Redondo, Patricia Medina-Burgos, Ildefonso Cases, Rosa Fernández, Ana M Rojas

doi:10.1093/nargab/lqae078

Open Access

https://doi.org/10.1093/nargab/lqae078

Journal: NAR Genomics and Bioinformatics
Publication Date: Jul 2, 2024
License type: CC BY 4.0

Abstract

Protein language models have been tested and proved to be reliable when used on curated datasets but have not yet been applied to full proteomes. Accordingly, we tested how two different machine learning-based methods performed when decoding functional information from the proteomes of selected model organisms. We found that protein language models are more precise and informative than deep learning methods for all the species tested and across the three gene ontologies studied, and that they better recover functional information from transcriptomic experiments. The results obtained indicate that these language models are likely to be suitable for large-scale annotation and downstream analyses, and we recommend a guide for their use.

Full Text