Predicting protein-protein interactions using high-quality non-interacting pairs

Long Zhang,Jun Wang,Guoxian Yu,Maozu Guo

doi:10.1186/s12859-018-2525-3

Abstract

Background: Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes. Machine learning-based approaches have been developed to predict PPIs, but their effectiveness is unsatisfactory. One major reason is that they either choose non-interacting protein pairs (negative samples) at random or heuristically select non-interacting pairs of low quality.

Results: To boost the effectiveness of PPI prediction, we propose two novel approaches (NIP-SS and NIP-RW) that generate high-quality non-interacting pairs based on sequence similarity and random walk, respectively. Specifically, the known PPIs collected from public databases are used to generate the positive samples. NIP-SS then selects the top-m dissimilar protein pairs as negative examples and controls the degree distribution of the selected proteins to construct the negative dataset. NIP-RW performs a random walk on the PPI network to update the adjacency matrix of the network, and then selects protein pairs that are not connected in the updated network as negative samples. Next, we use the auto covariance (AC) descriptor to encode the feature information of amino acid sequences. After that, we employ deep neural networks (DNNs) to predict PPIs based on the extracted features and the positive and negative examples. Extensive experiments show that NIP-SS and NIP-RW can generate negative samples of higher quality than existing strategies and thus enable more accurate prediction.

Conclusions: The experimental results prove that negative datasets constructed by NIP-SS and NIP-RW can reduce the bias and have good generalization ability. NIP-SS and NIP-RW can be used as a plugin to boost the effectiveness of PPI prediction. Code and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NIP.
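The auto covariance (AC) encoding mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of AC, in which each residue is first mapped to a vector of numeric physicochemical property values and the covariance between positions a fixed lag apart is then averaged along the sequence. The function name, the input layout, and the choice of `max_lag` are illustrative assumptions; the property set, normalization, and lag used by the authors are not stated in this text.

```python
from typing import List, Sequence


def auto_covariance(props: Sequence[Sequence[float]], max_lag: int) -> List[float]:
    """Encode a sequence as a fixed-length auto covariance vector.

    props[i][j] is the value of property j at residue i (hypothetical input:
    the caller is assumed to have already mapped residues to property values).
    The result has len(props[0]) * max_lag entries, independent of sequence length.
    """
    n = len(props)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_props = len(props[0])

    # Mean of each property over the whole sequence.
    means = [sum(row[j] for row in props) / n for j in range(n_props)]

    features: List[float] = []
    for j in range(n_props):
        centered = [row[j] - means[j] for row in props]
        for lag in range(1, max_lag + 1):
            # Average product of centered values that are `lag` residues apart.
            total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
            features.append(total / (n - lag))
    return features
```

Because the output length depends only on the number of properties and on `max_lag`, proteins of different lengths map to vectors of the same size, which is what allows a pair of proteins to be fed to a fixed-input classifier such as a DNN.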

Highlights

Identifying protein-protein interactions (PPIs) is of paramount importance for understanding cellular processes.
Twelve PPI datasets are obtained. Another six datasets were collected as independent test datasets to further assess the generalization ability of NIP-SS and NIP-RW. The Mammalian dataset, collected from Negatome 2.0 [33], contains only non-interacting protein pairs, which were generated by manual curation of the literature.
Each of these three groups contains four training sets. The four sets differ only in their negative samples, which are generated by NIP-SS, NIP-RW, subcellular location, and random pairing.
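The NIP-SS idea named in the highlights, selecting the most dissimilar protein pairs while controlling how often each protein appears, can be sketched as follows. This is a simplified illustration, not the authors' procedure: the similarity scores are assumed to be precomputed by the caller, and the degree control is approximated by a hypothetical per-protein cap (`max_degree`), whereas the paper's actual degree-distribution control is not described in this text.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def select_dissimilar_negatives(
    similarity: Dict[Pair, float],
    positives: Iterable[Pair],
    m: int,
    max_degree: int,
) -> List[Pair]:
    """Pick up to m least-similar non-interacting pairs under a degree cap."""
    known = {frozenset(p) for p in positives}

    # Candidate pairs are all scored pairs that are not known interactions,
    # ordered from least to most similar.
    candidates = sorted(
        (score, tuple(sorted(pair)))
        for pair, score in similarity.items()
        if frozenset(pair) not in known
    )

    degree: Dict[str, int] = {}
    negatives: List[Pair] = []
    for _, (a, b) in candidates:
        if len(negatives) == m:
            break
        # Skip pairs that would let one protein dominate the negative set.
        if degree.get(a, 0) >= max_degree or degree.get(b, 0) >= max_degree:
            continue
        negatives.append((a, b))
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return negatives
```

The cap is one simple way to keep a few highly dissimilar proteins from appearing in most negative pairs; without some control of this kind, a classifier could learn to recognize those proteins rather than the absence of interaction.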

Summary

Results

To boost the effectiveness of PPI prediction, we propose two novel approaches (NIP-SS and NIP-RW) that generate high-quality non-interacting pairs based on sequence similarity and random walk, respectively. The known PPIs collected from public databases are used to generate the positive samples. NIP-SS selects the top-m dissimilar protein pairs as negative examples and controls the degree distribution of the selected proteins to construct the negative dataset. NIP-RW performs a random walk on the PPI network to update the adjacency matrix of the network, and selects protein pairs that are not connected in the updated network as negative samples. We employ deep neural networks (DNNs) to predict PPIs based on the extracted features and the positive and negative examples. Extensive experiments show that NIP-SS and NIP-RW can generate negative samples of higher quality than existing strategies and enable more accurate prediction.
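The NIP-RW step described above can be illustrated with a small sketch. It assumes one plausible reading of "updating the adjacency matrix": a pair counts as connected in the updated network if a random walk of at most `steps` steps can move from one protein to the other with non-zero probability. The function names, the integer protein indices, and the `steps` parameter are assumptions made for illustration; the exact update rule used by the authors is not given in this text.

```python
from typing import Iterable, List, Tuple

Matrix = List[List[float]]


def _matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def random_walk_negatives(
    n_proteins: int,
    interactions: Iterable[Tuple[int, int]],
    steps: int,
) -> List[Tuple[int, int]]:
    """Return pairs that stay unconnected after a walk of at most `steps` steps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")

    # Symmetric adjacency matrix of the known PPI network.
    adj = [[0.0] * n_proteins for _ in range(n_proteins)]
    for u, v in interactions:
        adj[u][v] = adj[v][u] = 1.0

    # Row-normalized transition matrix; isolated proteins keep an all-zero row.
    trans: Matrix = []
    for row in adj:
        deg = sum(row)
        trans.append([x / deg if deg else 0.0 for x in row])

    walk = [row[:] for row in trans]
    reached = [[p > 0 for p in row] for row in walk]
    for _ in range(steps - 1):
        walk = _matmul(walk, trans)
        for i in range(n_proteins):
            for j in range(n_proteins):
                if walk[i][j] > 0:
                    reached[i][j] = True

    # Pairs never reached within the step budget are taken as negatives.
    return [(i, j)
            for i in range(n_proteins)
            for j in range(i + 1, n_proteins)
            if not reached[i][j] and not reached[j][i]]
```

With `steps=1` this reduces to taking every pair that is not a known interaction; larger values exclude pairs that are close in the network, on the assumption that such proteins are more likely to interact even though no interaction has been recorded.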

Journal: BMC Bioinformatics
Publication Date: Dec 1, 2018
Citations: 24
License type: open-access

Similar Papers

Protein co-evolution, co-adaptation and interactions
Florencio Pazos ... Alfonso Valencia
The EMBO Journal | VOL. 27
25 Sep 2008

Flaws in evaluation schemes for pair-input computational predictions
Yungki Park ... Edward M Marcotte
Nature Methods | VOL. 9
01 Dec 2012

Non-interacting proteins may resemble interacting proteins: prevalence and implications
Guillaume Launay ... Juliette Martin
Scientific Reports | VOL. 7
13 Jan 2017

Predicting Protein-Protein Interactions from Multimodal Biological Data Sources via Nonnegative Matrix Tri-Factorization
Hua Wang ... Feiping Nie
01 Jan 2012
