Comparative evaluation of methods for the prediction of protein-ligand binding sites.

Javier S Utgés, Geoffrey J Barton

doi:10.1186/s13321-024-00923-z

Abstract

The accurate identification of protein-ligand binding sites is of critical importance in understanding and modulating protein function. Accordingly, ligand binding site prediction has remained a research focus for over three decades, with over 50 methods developed and a paradigm shift from geometry-based approaches to machine learning. In this work, we collate 13 ligand binding site predictors, spanning 30 years, focusing on the latest machine learning-based methods such as VN-EGNN, IF-SitePred, GrASP, PUResNet, and DeepPocket, and compare them to the established P2Rank, PRANK and fpocket and to earlier methods like PocketFinder, Ligsite and Surfnet. We benchmark the methods against the human subset of our new curated reference dataset, LIGYSIS. LIGYSIS is a comprehensive protein-ligand complex dataset comprising 30,000 proteins with bound ligands, which aggregates biologically relevant unique protein-ligand interfaces across biological units of multiple structures from the same protein. LIGYSIS is an improvement for testing methods over earlier datasets like sc-PDB, PDBbind, binding MOAD, COACH420 and HOLO4K, which either include 1:1 protein-ligand complexes or consider asymmetric units. Re-scoring of fpocket predictions by PRANK and DeepPocket displays the highest recall (60%), whilst IF-SitePred presents the lowest recall (39%). We demonstrate the detrimental effect that redundant prediction of binding sites has on performance, as well as the beneficial impact of stronger pocket scoring schemes, with improvements of up to 14% in recall (IF-SitePred) and 30% in precision (Surfnet). Finally, we propose top-N+2 recall as the universal benchmark metric for ligand binding site prediction and urge authors to share not only the source code of their methods, but also of their benchmark.

Scientific contributions

This study conducts the largest benchmark of ligand binding site prediction methods to date, comparing 13 original methods and 15 variants using 10 informative metrics. The LIGYSIS dataset is introduced, which aggregates biologically relevant protein-ligand interfaces across multiple structures of the same protein. The study highlights the detrimental effect of redundant binding site prediction and demonstrates significant improvement in recall and precision through stronger scoring schemes. Finally, top-N+2 recall is proposed as a universal benchmark metric for ligand binding site prediction, with a recommendation for open-source sharing of both methods and benchmarks.
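
The abstract names top-N+2 recall without defining it. The sketch below is one plausible reading, not the authors' benchmark code: for a protein with N observed binding sites, only the N+2 highest-scoring predicted pockets are considered, and recall is the fraction of observed sites matched by at least one of them. The function name, the centre-to-centre matching rule, the 12 Å threshold and the toy coordinates are all assumptions made for illustration.

```python
from math import dist


def top_n_plus_2_recall(observed, predicted, threshold=12.0):
    """Per-protein top-N+2 recall (illustrative definition, see lead-in).

    observed:  list of (x, y, z) centres of the N observed binding sites.
    predicted: list of (score, (x, y, z)) predicted pockets.
    threshold: assumed maximum centre-to-centre distance for a match.
    """
    n = len(observed)
    if n == 0:
        raise ValueError("at least one observed site is required")
    # Rank predictions by score and keep only the N+2 best.
    ranked = sorted(predicted, key=lambda p: p[0], reverse=True)
    top = [centre for _, centre in ranked[: n + 2]]
    # An observed site counts as recalled if any kept prediction is close enough.
    hits = sum(
        any(dist(site, centre) <= threshold for centre in top)
        for site in observed
    )
    return hits / n


# Toy protein with two observed sites, far apart.
observed = [(0.0, 0.0, 0.0), (50.0, 0.0, 0.0)]

# Redundant output: four near-identical pockets fill the top N+2 = 4 slots,
# so the pocket matching the second site (rank 5) is never considered.
redundant = [
    (0.9, (1.0, 0.0, 0.0)),
    (0.8, (2.0, 0.0, 0.0)),
    (0.7, (0.0, 1.0, 0.0)),
    (0.6, (1.0, 1.0, 0.0)),
    (0.5, (50.0, 1.0, 0.0)),
]

# Same predictions after collapsing the duplicates into one pocket.
deduplicated = [
    (0.9, (1.0, 0.0, 0.0)),
    (0.5, (50.0, 1.0, 0.0)),
]

print(top_n_plus_2_recall(observed, redundant))
print(top_n_plus_2_recall(observed, deduplicated))
```

Under these assumptions the toy example also shows why redundant predictions hurt: duplicates of a single pocket occupy the limited top-ranked slots and push a correct prediction for another site out of the evaluated set.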

Journal: Journal of Cheminformatics
Publication Date: Nov 11, 2024
License type: CC BY 4.0

