Abstract

Background: Self-interacting Proteins (SIPs) play a critical role in a range of life functions in most living cells, and research on SIPs is an important part of molecular biology. Although a large amount of SIP data has been produced, traditional experimental methods are labor-intensive, time-consuming, and costly, and they yield only limited results relative to real-world needs. Hence, an efficient computational method for SIP prediction is urgently needed to fill the gap. Deep learning has produced substantial performance improvements in many areas, but its effectiveness for SIP prediction has not been verified.

Results: We developed a deep learning model for predicting SIPs by constructing a Stacked Long Short-Term Memory (SLSTM) neural network that uses dropout. We extracted features from protein sequences using a novel feature extraction scheme that combines Zernike Moments (ZMs) with the Position Specific Weight Matrix (PSWM). The proposed approach was assessed on S. cerevisiae and Human SIP datasets. The results indicate that the deep learning approach can effectively resist data skew, achieving accuracies of 95.69% and 97.88%, respectively. To demonstrate the advantage of deep learning, we compared the SLSTM-based method with the widely used Support Vector Machine (SVM) method and several other well-known methods on the same datasets.

Conclusion: The results show that our method is overall superior to the other state-of-the-art techniques it was compared with. As far as we know, this study is the first to apply a deep learning method to SIP prediction, and the experimental results reveal its potential for SIP identification.

Highlights

  • Self-interacting Proteins (SIPs) play a critical role in a range of life functions in most living cells

  • Performance evaluation: to evaluate the methods presented in this paper, we used several common indicators: accuracy (ACC), true positive rate (TPR), positive predictive value (PPV), specificity (SPC), and Matthews Correlation Coefficient (MCC)

  • A similar pattern appears on the Human dataset: the Zernike Moments (ZMs)-Stacked Long Short-Term Memory (SLSTM) method performed better, with 97.88% ACC, 88.00% TPR, 98.70% SPC, 84.93% PPV, 85.60% MCC and 0.9908 AUC, versus 95.30% ACC, 54.26% TPR, 99.01% SPC, 83.27% PPV, 66.07% MCC and 0.9261 AUC for the compared method
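
The indicators named in the highlights (ACC, TPR, PPV, SPC, MCC) are standard functions of the four confusion-matrix counts. The following is a minimal Python sketch of those definitions, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification indicators from raw counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "ACC": _ratio(tp + tn, tp + fp + tn + fn),  # accuracy
        "TPR": _ratio(tp, tp + fn),                 # true positive rate (sensitivity)
        "SPC": _ratio(tn, tn + fp),                 # specificity
        "PPV": _ratio(tp, tp + fp),                 # positive predictive value (precision)
        "MCC": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the paper.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

On a skewed dataset with few positives, ACC can stay high even when TPR is low, which is why MCC and TPR are worth reading alongside accuracy.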

Summary

Introduction

Self-interacting Proteins (SIPs) play a critical role in a range of life functions in most living cells, and research on SIPs is an important part of molecular biology. An efficient computational method for SIP prediction is urgently needed to fill the gap left by labor-intensive, costly experimental methods. Deep learning has produced substantial performance improvements in many areas, but its effectiveness for SIP prediction has not been verified. As the embodiment of life activity, proteins do not exist in isolation; they carry out most cellular processes through interactions. Protein-protein interactions (PPIs) have therefore been a focus of the study of biological processes. SIPs are considered a special type of protein interaction in which the interacting partners have the same arrangement of amino acids, which leads to the formation of a homodimer.
