Deep Neural Network Based Predictions of Protein Interactions Using Primary Sequences.

Hang Li, Hua Yu, Xiu-Jun Gong, Chang Zhou

doi:10.3390/molecules23081923

Abstract

Machine learning based predictions of protein–protein interactions (PPIs) could provide valuable insights into protein functions, disease occurrence, and therapy design on a large scale. The intensive feature engineering in most of these methods makes the prediction task more tedious and trivial. The emerging deep learning technology enabling automatic feature engineering is gaining great success in various fields. However, the over-fitting and generalization of its models are not yet well investigated in most scenarios. Here, we present a deep neural network framework (DNN-PPI) for predicting PPIs using features learned automatically only from protein primary sequences. Within the framework, the sequences of two interacting proteins are sequentially fed into the encoding, embedding, convolutional neural network (CNN), and long short-term memory (LSTM) neural network layers. Then, a concatenated vector of the two outputs from the previous layer is wired as the input of the fully connected neural network. Finally, the Adam optimizer is applied to learn the network weights in a back-propagation fashion. The different types of features, including semantic associations between amino acids, position-related sequence segments (motifs), and their long- and short-term dependencies, are captured in the embedding, CNN, and LSTM layers, respectively. When the model was trained on Pan’s human PPI dataset, it achieved a prediction accuracy of 98.78% with a Matthews correlation coefficient (MCC) of 97.57%. The prediction accuracies for six external datasets ranged from 92.80% to 97.89%, making them superior to those achieved with previous methods. When performed on Escherichia coli, Drosophila, and Caenorhabditis elegans datasets, DNN-PPI obtained prediction accuracies of 95.949%, 98.389%, and 98.669%, respectively. The performances in cross-species testing among the four species above were consistent with their evolutionary distances. However, when testing Mus musculus using the models from those species, all of them obtained prediction accuracies of over 92.43%, which is difficult to achieve and worthy of note for further study. These results suggest that DNN-PPI has remarkable generalization and is a promising tool for identifying protein interactions.
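The encoding step that precedes the embedding layer can be illustrated with a minimal Python sketch. The abstract does not specify the encoding scheme, so everything here is an assumption made for illustration: the alphabet order, the reserved padding index 0, the extra index for non-standard residues, and the hypothetical helper names `encode_sequence` and `encode_pair`. It only shows how two primary sequences might be turned into fixed-length integer vectors that an embedding layer could consume.

```python
# Illustrative only: the index assignment and padding scheme are assumptions,
# not the encoding used by the DNN-PPI authors.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # 0 is reserved for padding
UNKNOWN_INDEX = len(AMINO_ACIDS) + 1  # shared index for non-standard residues (e.g. X, U, B)


def encode_sequence(sequence, max_length):
    """Map each residue to an integer, then truncate or right-pad with 0 to max_length."""
    indices = [AA_TO_INDEX.get(residue, UNKNOWN_INDEX) for residue in sequence.upper()]
    indices = indices[:max_length]
    return indices + [0] * (max_length - len(indices))


def encode_pair(seq_a, seq_b, max_length):
    """Encode both proteins of a candidate pair with the same scheme and length."""
    return encode_sequence(seq_a, max_length), encode_sequence(seq_b, max_length)


if __name__ == "__main__":
    first, second = encode_pair("MKV", "acdxy", max_length=6)
    print(first)
    print(second)
```

Each protein in a pair is encoded independently with the same mapping, which matches the description of two sequences passing through the same stack of layers before their outputs are concatenated.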

Highlights

Proteins often act through functions with their partners
We further investigated the capability of auto feature engineering in the deep learning framework for predicting protein–protein interactions (PPIs)
When the model was trained on Pan’s human PPI dataset, it achieved a prediction accuracy of 98.78% with a Matthews correlation coefficient (MCC) of 97.57%
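For readers unfamiliar with the two reported metrics, the following Python sketch computes accuracy and MCC from the four counts of a binary confusion matrix using their standard definitions. The function names and the example counts are hypothetical and are not taken from the paper; returning 0.0 when the MCC denominator is zero is a common convention, adopted here as an assumption.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention (assumption) when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not results from the paper.
    tp, tn, fp, fn = 45, 45, 5, 5
    print(accuracy(tp, tn, fp, fn))
    print(matthews_corrcoef(tp, tn, fp, fn))
```

Unlike accuracy, MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it stays informative when the interacting and non-interacting classes are imbalanced.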

Summary

Introduction

Proteins often act through functions with their partners. These interacting proteins regulate a variety of cellular functions, including cell-cycle progression, signal transduction, and metabolic pathways [1]. The identification of protein–protein interactions (PPIs) can provide great insight into protein functions, further biological processes, drug target detection, and even treatment design [2]. Compared with experimental approaches, such as protein chips [3], tandem affinity purifications (TAP) [4], and other high-throughput biological techniques [5], computational methods for predicting PPIs are gaining greater exposure, as they are less labor-intensive and more efficient [6]. Machine learning approaches dominate most of the computational methods for the prediction of PPIs [7]. Building a meaningful feature set and choosing corresponding machine learning algorithms are two key steps for successful predictions in traditional machine learning

Journal: Molecules	Publication Date: Aug 1, 2018
Citations: 107	License type: CC BY 4.0
