Cracking the black box of deep sequence-based protein-protein interaction prediction.

Judith Bernett, Markus List, David B Blumenthal

doi:10.1093/bib/bbae076

Abstract

Identifying protein-protein interactions (PPIs) is crucial for deciphering biological pathways. Numerous prediction methods have been developed as cheap alternatives to biological experiments, reporting surprisingly high accuracy estimates. We systematically investigated how much reproducible deep learning models depend on data leakage, sequence similarities and node degree information, and compared them with basic machine learning models. We found that overlaps between training and test sets resulting from random splitting lead to strongly overestimated performances. In this setting, models learn solely from sequence similarities and node degrees. When data leakage is avoided by minimizing sequence similarities between training and test set, performances become random. Moreover, baseline models directly leveraging sequence similarity and network topology show good performances at a fraction of the computational cost. Thus, we advocate that any improvements should be reported relative to baseline methods in the future. Our findings suggest that predicting PPIs remains an unsolved task for proteins showing little sequence similarity to previously studied proteins, highlighting that further experimental research into the 'dark' protein interactome and better computational methods are needed.
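The leakage described in the abstract can be illustrated with a minimal sketch. It assumes a toy interaction network (the protein names `P0` to `P29` and all function names are hypothetical, not taken from the paper) and compares a random split over interaction pairs with a split that keeps proteins disjoint between training and test sets. Note that the abstract goes further and minimizes sequence similarity between the two sets, which this sketch does not attempt; protein disjointness is only the weaker, first condition.

```python
import random


def random_pair_split(pairs, test_fraction=0.2, seed=0):
    """Shuffle interaction pairs and split them; a protein may land on both sides."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def protein_disjoint_split(pairs, test_fraction=0.2, seed=0):
    """Assign every protein to exactly one side and keep only same-side pairs."""
    rng = random.Random(seed)
    proteins = sorted({p for pair in pairs for p in pair})
    rng.shuffle(proteins)
    n_test = int(len(proteins) * test_fraction)
    test_proteins = set(proteins[:n_test])
    train, test = [], []
    for a, b in pairs:
        if a in test_proteins and b in test_proteins:
            test.append((a, b))
        elif a not in test_proteins and b not in test_proteins:
            train.append((a, b))
        # Pairs that span both sides are dropped, since they would leak a protein.
    return train, test


def shared_proteins(train, test):
    """Return the proteins that occur in both the training and the test pairs."""
    train_proteins = {p for pair in train for p in pair}
    test_proteins = {p for pair in test for p in pair}
    return train_proteins & test_proteins


# Toy network (an assumption for illustration): one hub protein P0 that
# interacts with everything, plus a chain P1-P2, P2-P3, ..., P28-P29.
names = [f"P{i}" for i in range(30)]
pairs = [(names[0], names[i]) for i in range(1, 30)]
pairs += [(names[i], names[i + 1]) for i in range(1, 29)]

rand_train, rand_test = random_pair_split(pairs)
disj_train, disj_test = protein_disjoint_split(pairs)

print("random split, shared proteins:", len(shared_proteins(rand_train, rand_test)))
print("disjoint split, shared proteins:", len(shared_proteins(disj_train, disj_test)))
print("pairs kept by disjoint split:", len(disj_train) + len(disj_test), "of", len(pairs))
```

The disjoint split discards every pair that crosses the train/test boundary, so it trades dataset size for a cleaner evaluation. In this toy network the high-degree hub `P0` shows why a random split almost inevitably shares proteins across both sets.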

Journal: Briefings in Bioinformatics
Publication Date: Jan 22, 2024
Citations: 8
License type: CC BY 4.0
