Abstract

Background: Recurrent neural networks (RNNs) are well suited to processing sequential data, but they are inefficient at computing over long sequences. As a variant of the RNN, the long short-term memory (LSTM) network solves this problem to some extent. Here we improve the LSTM for big-data applications in protein-protein interaction interface residue pair prediction, for two reasons. On the one hand, the LSTM has deficiencies such as shallow layers and exploding or vanishing gradients, and with the dramatic growth of data, the imbalance between algorithmic innovation and big-data processing has become more serious and urgent. On the other hand, prediction of protein-protein interaction interface residue pairs is an important problem in biology, but the low prediction accuracy compels us to propose new computational methods.

Results: To surmount the aforementioned problems of the LSTM, we adopt a residual architecture and add an attention mechanism to the LSTM. In detail, we redefine the block, add a front-to-back shortcut connection across every two layers, and introduce an attention mechanism to strengthen the network's capability of mining information. We then use this model to predict protein-protein interaction interface residue pairs and achieve a good accuracy of over 72%. Furthermore, we compare our method with random experiments, PPiPP, the standard LSTM, and some other machine learning methods; our method performs better than all of them.

Conclusion: We present an attention-mechanism-enhanced LSTM with a residual architecture, which makes deeper networks trainable without gradient vanishing or explosion to a certain extent. We then apply it to a significant problem, protein-protein interaction interface residue pair prediction, and obtain better accuracy than other methods. Our method provides a new approach for protein-protein interaction computation, which will be helpful for related biomedical research.
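The architecture described above (shortcut connections spanning every two LSTM layers, followed by an attention mechanism) can be illustrated with a minimal PyTorch sketch. All class names, hyperparameters, and the final classifier head below are illustrative assumptions, not the authors' released code.

# Minimal sketch of a residual LSTM with attention, assuming PyTorch.
# Names and hyperparameters are hypothetical, chosen only to illustrate
# the "shortcut across every two layers + attention" idea from the abstract.
import torch
import torch.nn as nn


class ResidualLSTMBlock(nn.Module):
    """Two stacked LSTM layers with a skip connection from input to output."""

    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, num_layers=2, batch_first=True)

    def forward(self, x):                  # x: (batch, seq_len, dim)
        out, _ = self.lstm(x)
        return out + x                     # residual front-to-back shortcut


class AttentionPool(nn.Module):
    """Additive attention that weights each time step before pooling."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h):                  # h: (batch, seq_len, dim)
        alpha = torch.softmax(self.score(h), dim=1)   # per-step weights
        return (alpha * h).sum(dim=1)      # (batch, dim) context vector


class ResidualAttentionLSTM(nn.Module):
    """Deep residual LSTM + attention ending in a two-class classifier."""

    def __init__(self, in_dim, hidden_dim=64, n_blocks=4):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.ModuleList(
            ResidualLSTMBlock(hidden_dim) for _ in range(n_blocks))
        self.attn = AttentionPool(hidden_dim)
        self.head = nn.Linear(hidden_dim, 2)   # interacting vs. not

    def forward(self, x):                  # x: (batch, seq_len, in_dim)
        h = self.proj(x)
        for block in self.blocks:
            h = block(h)
        return self.head(self.attn(h))

Because each block adds its input back to its output, gradients have a direct path through the skip connections, which is what allows the network to be made deeper without the gradient vanishing or explosion mentioned in the abstract.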

Highlights

  • Recurrent neural networks (RNNs) are well suited to processing sequential data, but they are inefficient at computing over long sequences

  • The recurrent neural network (RNN) is a major neural network in deep learning, serving as a bridge that connects information from the past to the present

  • From the perspective of homology, we find that 1GL1 shares a slightly higher homology of 16.7% with 2I9B



Introduction

The recurrent neural network (RNN) is a major neural network in deep learning; its long short-term memory (LSTM) variant was proposed by Hochreiter and Schmidhuber, and it serves as a bridge connecting information from the past to the present. The RNN is based on the back-propagation algorithm with an added time factor, so it is trained with the back-propagation-through-time (BPTT) algorithm, which we describe briefly here. Let $x$ denote the input vector, let $x_i^t$ denote the value of the $i$-th component of $x$ at time $t$, and let $w_{ij}$ denote the weight connecting unit $i$ to unit $j$.
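With this notation, the forward step that BPTT differentiates through can be written as follows. This is the conventional formulation of a simple RNN, supplied here for completeness; the hidden state $h_j^t$, recurrent weights $u_{kj}$, bias $b_j$, and activation $f$ are standard symbols not defined in the excerpt above.

% Standard RNN forward step (conventional formulation; h_j^t, u_{kj},
% b_j, and f are assumed symbols beyond those the text defines).
$$ h_j^t = f\Big( \sum_i w_{ij}\, x_i^t + \sum_k u_{kj}\, h_k^{t-1} + b_j \Big) $$

Unrolling this recurrence over $t$ and applying the chain rule to the unrolled graph gives the BPTT gradients for $w_{ij}$ and $u_{kj}$; the repeated multiplication by the recurrent weights along a long unrolling is what produces the vanishing and exploding gradients noted in the abstract.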

