Learning from the ligand: using ligand-based features to improve binding affinity prediction.

Fergus Boyles, Garrett M Morris, Charlotte M Deane

doi:10.1093/bioinformatics/btz665

Abstract

Machine learning scoring functions for protein-ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein-ligand complex, with limited information about the chemical or topological properties of the ligand itself. We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest (RF) combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.836, 0.780 and 0.821 on the PDBbind 2007, 2013 and 2016 core sets, respectively, compared to 0.790, 0.746 and 0.814 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, an RF using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets. Data and code to reproduce all the results are freely available at http://opig.stats.ox.ac.uk/resources. Supplementary data are available at Bioinformatics online.
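The comparison described in the abstract (structure-based features alone versus structure-based plus ligand-based features, scored by Pearson correlation) can be illustrated with a minimal sketch. This is not the authors' code: the feature matrices are synthetic random placeholders rather than RF-Score v3 or RDKit output, the feature counts and the toy target are arbitrary assumptions, and the helper names `make_split`, `pearson` and `fit_and_score` are hypothetical. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Arbitrary sizes for illustration only (assumptions, not values from the paper).
N_TRAIN, N_TEST = 200, 50
N_STRUCT, N_LIGAND = 36, 20


def make_split(n):
    """Return synthetic (structure features, ligand features, affinity) arrays.

    These are random placeholders, NOT real PDBbind data. The toy target
    depends on one feature from each block so that both blocks carry signal.
    """
    struct = rng.normal(size=(n, N_STRUCT))
    ligand = rng.normal(size=(n, N_LIGAND))
    y = struct[:, 0] + ligand[:, 0] + 0.1 * rng.normal(size=n)
    return struct, ligand, y


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


def fit_and_score(x_train, y_train, x_test, y_test):
    """Train a Random Forest regressor and return Pearson r on the test set."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(x_train, y_train)
    return pearson(model.predict(x_test), y_test)


s_tr, l_tr, y_tr = make_split(N_TRAIN)
s_te, l_te, y_te = make_split(N_TEST)

# Structure-based features alone.
r_struct = fit_and_score(s_tr, y_tr, s_te, y_te)

# Structure-based and ligand-based features concatenated column-wise.
x_tr_combined = np.hstack([s_tr, l_tr])
x_te_combined = np.hstack([s_te, l_te])
r_combined = fit_and_score(x_tr_combined, y_tr, x_te_combined, y_te)

print(f"Pearson r, structure-based only: {r_struct:.3f}")
print(f"Pearson r, combined features:    {r_combined:.3f}")
```

With real data, the two placeholder blocks would be replaced by features computed from the protein-ligand complex and by descriptors computed from the ligand alone. The numbers printed here reflect only the toy target and say nothing about the results reported in the abstract.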


Journal: Bioinformatics
Publication Date: Aug 26, 2019
Citations: 71

Similar Papers

Classical scoring functions for docking are unable to exploit large volumes of structural and interaction data.
Hongjian Li ... Man-Hon Wong
Bioinformatics | VOL. 35
14 Mar 2019

Comparative assessment of machine-learning scoring functions on PDBbind 2013
Mohamed A Khamis ... Walid Gomaa
Engineering Applications of Artificial Intelligence | VOL. 45
16 Jul 2015

Substituting random forest for multiple linear regression improves binding affinity prediction of scoring functions: Cyscore as a case study.
Hongjian Li ... Kwong-Sak Leung
BMC Bioinformatics | VOL. 15
27 Aug 2014

Learning protein-ligand binding affinity with atomic environment vectors
Rocco Meli ... Mike J Bodkin
Journal of Cheminformatics | VOL. 13
14 Aug 2021
