Abstract

Protein function prediction is one of the most well-studied topics in computational biology and has attracted the attention of many researchers. Implementing deep neural networks that improve the prediction of protein function, however, remains a major challenge. In this research, we propose a new strategy that combines gated recurrent units (GRUs) with position-specific scoring matrix (PSSM) profiles to predict vesicular transport proteins, which carry out a biological function of great importance. Although the function of these proteins is difficult to identify, our model achieves accuracies of 82.3% and 85.8% on the cross-validation and independent datasets, respectively. We also address the class imbalance in the dataset by tuning the class weights in the deep learning model. The model achieves a sensitivity of 79.2%, a specificity of 82.9%, an MCC of 0.52, and an AUC of 0.861. On the same dataset, our strategy outperforms all other state-of-the-art algorithms. This work provides a technique for discovering more proteins, particularly those associated with vesicular transport. In addition, our results could encourage the use of the gated recurrent unit architecture in protein function prediction.
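The class weighting and evaluation metrics named in the abstract can be illustrated with a short Python sketch. It is not the authors' code: the function names `balanced_class_weights` and `binary_metrics` are hypothetical, and the inverse-frequency weighting shown is one common heuristic, assumed here because the abstract does not state the weights actually used.

```python
import math


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: a common heuristic, not necessarily the paper's weighting.
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def _ratio(num, den):
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and MCC from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "mcc": _ratio(tp * tn - fp * fn, denom),
    }
```

A weight dictionary of this shape gives the minority class a larger contribution to the loss, which is the general idea behind tuning class weights on an imbalanced dataset. AUC is omitted because it needs ranked prediction scores rather than confusion counts.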

Highlights

  • Proteins perform a wide variety of functions within different eukaryotic cell compartments

  • We propose a novel approach to protein function prediction that uses a deep gated recurrent unit (GRU) architecture, a form of deep neural network

  • We present an innovative technique for discriminating vesicular transport proteins using GRU and position-specific scoring matrix (PSSM) profiles
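To make the GRU-over-PSSM idea in the highlights concrete, here is a minimal pure-Python sketch of a single GRU layer reading a PSSM profile row by row (one row of 20 scores per residue). Everything in it is an assumption for illustration: the helper names (`init_params`, `gru_step`, `encode_pssm`), the random initialisation, and the hidden size are hypothetical, and the paper's actual architecture and training procedure are not reproduced.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(n_in, n_hidden, seed=0):
    """Small random weights for the update (z), reset (r), and candidate (h) parts."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "zrh":
        p["W" + g] = mat(n_hidden, n_in)      # input-to-hidden weights
        p["U" + g] = mat(n_hidden, n_hidden)  # hidden-to-hidden weights
        p["b" + g] = [0.0] * n_hidden         # bias
    return p


def gate(x, h, p, g, act):
    pre = zip(matvec(p["W" + g], x), matvec(p["U" + g], h), p["b" + g])
    return [act(a + b + c) for a, b, c in pre]


def gru_step(x, h, p):
    """One GRU update, using the convention h_new = (1 - z) * h + z * candidate."""
    z = gate(x, h, p, "z", sigmoid)                    # update gate
    r = gate(x, h, p, "r", sigmoid)                    # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = gate(x, reset_h, p, "h", math.tanh)         # candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_pssm(pssm, p):
    """Run the GRU over all PSSM rows and return the final hidden state."""
    h = [0.0] * len(p["bz"])
    for row in pssm:
        h = gru_step(row, h, p)
    return h
```

The final hidden state is a fixed-length summary of a variable-length protein, which a classifier layer could then map to a vesicular-transport score. Some libraries swap the roles of `z` and `1 - z` in the update; the two conventions are equivalent up to relabelling the gate.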

Introduction

Proteins perform a wide variety of functions within different eukaryotic cell compartments. Prediction of protein functions is one of the most well-studied problems in the field of computational biology, attracting the attention of many scientists. Many computational methods have been developed to improve the predictive performance for protein functions. Two popular approaches to this problem are finding the best feature sets and building powerful predictive neural networks. In the past, some bioinformatics researchers used machine learning techniques with strong feature sets such as pseudo amino acid composition [1,2], position-specific scoring matrix (PSSM) [3,4], and biochemical properties [5,6]. With the rise of deep learning, many researchers in the field of
