Implementation of machine learning models to determine the appropriate model for protein function prediction

Yekaterina Golenko, Anargul Shaushenova, Zhazira Mutalova, Aisulu Ismailova, Akgul Naizagarayeva, Damir Dossalyanov, Aliya Ainagulova

doi:10.15587/1729-4061.2022.263270

Abstract

Predicting protein function is a crucial part of genome annotation and can help solve a wide range of biological problems. Many methods are available for this task. However, apart from the sequence itself, most features are difficult to obtain or unavailable for many proteins, which limits the scope of these methods. In addition, sequence-based prediction methods often perform worse than methods that use multiple features, and predicting protein features can be time-consuming. Recent advances in this field are associated with the development of machine learning, which has shown great progress in predicting protein functions. Today, however, most protein sequences still have the status "uncharacterized" or "putative". Assessing how accurately protein functions are identified is therefore an urgent task for machine learning approaches to function prediction. In this study, the performance of two popular function prediction algorithms (ProtCNN and BiLSTM) was assessed from two perspectives, and the procedures for building these models were described. On Pfam families, ProtCNN achieved an accuracy of 0.988 and the bidirectional LSTM an accuracy of 0.9506. Using the Pfam dataset increased classification accuracy because of its large training set; prediction quality improves as the amount of training data grows. The study demonstrated that machine learning algorithms can be an effective tool for building protein function prediction models; in particular, a CNN can be adapted as an accurate tool for annotating protein functions when large datasets are available.
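As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows one common way to turn a protein sequence into a numeric input for a sequence classifier and to compute classification accuracy. It is not the authors' code: the function names `one_hot_encode` and `accuracy`, the 20-letter alphabet, the padding scheme, and the example labels are all assumptions made for illustration only.

```python
# Illustrative sketch only; not taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode a sequence as a max_len x 20 matrix of 0/1 values.

    Longer sequences are truncated, shorter ones are zero-padded, and
    residues outside the assumed alphabet become all-zero rows.
    """
    rows = []
    for residue in sequence[:max_len].upper():
        row = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            row[index] = 1
        rows.append(row)
    while len(rows) < max_len:
        rows.append([0] * len(AMINO_ACIDS))
    return rows


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true family labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


if __name__ == "__main__":
    encoded = one_hot_encode("MKV", max_len=5)
    print(len(encoded), len(encoded[0]))
    # Made-up family labels, used only to exercise the metric.
    truth = ["famA", "famB", "famA", "famC"]
    preds = ["famA", "famB", "famC", "famC"]
    print(accuracy(truth, preds))
```

Under this definition, accuracy is a fraction between 0 and 1, which is the scale on which values such as 0.988 and 0.9506 would be read.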


Journal: Eastern-European Journal of Enterprise Technologies	Publication Date: Oct 30, 2022
Citations: 1	License type: CC BY 4.0

Similar Papers

Year 2 Report: Protein Function Prediction Platform
C Zhou
27 Apr 2012

Amino acid features: a missing compartment of prediction of protein function
Esmaeil Ebrahimie ... Mahdi Ebrahimi
Nature Precedings | VOL. 6
13 Dec 2011

Improving protein function prediction by learning and integrating representations of protein sequences and function labels
Frimpong Boadu ... Jianlin Cheng
Bioinformatics Advances | VOL. 4
17 Aug 2024

Identification of protein functions using a machine-learning approach based on sequence-derived properties
Bum Ju Lee ... Young Joon Oh
Proteome Science | VOL. 7
09 Aug 2009
