Abstract

Adaptor proteins, also known as signal transduction adaptor proteins, are important components of signal transduction pathways, where they connect signaling proteins to relay signals between cells. Studies have shown that adaptor proteins are closely associated with diseases such as tumors and diabetes. It is therefore valuable to construct a model that accurately identifies adaptor proteins. In recent years, many studies have combined position-specific scoring matrix (PSSM) profiles with neural network methods to identify adaptor proteins. However, ordinary neural network models cannot adequately capture the contextual information in PSSM profiles, so these studies usually reduce the 20×N (N>20) PSSM to 20×20 dimensions, which discards a large amount of protein information. This research proposes an efficient method that combines a one-dimensional convolutional neural network (1-D CNN) with a bidirectional long short-term memory network (biLSTM) to identify adaptor proteins. The complete PSSM profiles are the input of the model, so the complete information of the protein is retained during training. We perform cross-validation during model training and test the performance of the model on an independent test set. On a data set of 1,224 adaptor proteins and 11,078 non-adaptor proteins, five indicators were used to evaluate model performance: specificity, sensitivity, accuracy, area under the receiver operating characteristic curve (AUC), and Matthews correlation coefficient (MCC). On the independent test set, the specificity, sensitivity, accuracy and MCC were 0.817, 0.865, 0.823 and 0.465, respectively. These results indicate that our method outperforms state-of-the-art methods. This study aims to improve the accuracy of adaptor protein identification and lays a foundation for further research on diseases related to adaptor proteins.
This research also offers a new approach to applying deep learning models in bioinformatics and computational biology.
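
The threshold-based indicators named in the abstract (specificity, sensitivity, accuracy and MCC) all follow from the counts of a binary confusion matrix. As a minimal sketch, they could be computed as shown here. The function names and the example counts are hypothetical and are not taken from the study. AUC is omitted because it requires ranked prediction scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = adaptor, 0 = non-adaptor)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, accuracy and MCC from confusion counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / total if total else 0.0
    # MCC stays informative on imbalanced data, such as a data set with
    # far more non-adaptor than adaptor proteins.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not results from the study.
    print(classification_metrics(tp=8, fp=10, tn=80, fn=2))
```

With a heavily imbalanced class ratio, accuracy alone can look high even when few positives are found, which is one reason to read MCC and sensitivity alongside it.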
