Comparative Assessment Of Scoring Functions Research Articles

In structure-based drug design, scoring functions are widely used for fast evaluation of protein-ligand interactions. They are often applied in combination with molecular docking and de novo design methods. Since the early 1990s, a whole spectrum of protein-ligand interaction scoring functions has been developed. Regardless of their technical differences, scoring functions all need data sets combining protein-ligand complex structures and binding affinity data for parametrization and validation. However, data sets of this kind used to be rather limited in terms of size and quality. In addition, standard metrics for evaluating scoring functions used to be ambiguous. Scoring functions are often tested in molecular docking or even virtual screening trials, which do not directly reflect the genuine quality of scoring functions. Collectively, these underlying obstacles have impeded the invention of more advanced scoring functions. In this Account, we describe our long-lasting efforts to overcome these obstacles, which involve two related projects. In the first project, we have created the PDBbind database. It is the first database that systematically annotates the protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. This database has been updated annually since its first public release in 2004. The latest release (version 2016) provides binding data for 16 179 biomolecular complexes in PDB. Data sets provided by PDBbind have been applied to many computational and statistical studies on protein-ligand interaction and various other subjects. In particular, it has become a major data resource for scoring function development. In the second project, we have established the Comparative Assessment of Scoring Functions (CASF) benchmark for scoring function evaluation. Our key idea is to decouple the "scoring" process from the "sampling" process, so scoring functions can be tested in a relatively pure context to reflect their quality.
In our latest work on this track, i.e., CASF-2013, the performance of a scoring function was quantified in four aspects: "scoring power", "ranking power", "docking power", and "screening power". All four performance tests were conducted on a test set containing 195 high-quality protein-ligand complexes selected from PDBbind. A panel of 20 standard scoring functions was tested as a demonstration. Importantly, CASF is designed to be an open-access benchmark, with which scoring functions developed by different researchers can be compared on the same grounds. Indeed, it has become a popular choice for scoring function validation in recent years. Despite the considerable progress that has been made so far, the performance of today's scoring functions still does not meet people's expectations in many aspects. There is a constant demand for more advanced scoring functions. Our efforts have helped to overcome some obstacles underlying scoring function development so that the researchers in this field can move forward faster. We will continue to improve the PDBbind database and the CASF benchmark in the future to keep them as useful community resources.
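
The idea of decoupling "scoring" from "sampling" can be illustrated with a small docking-power check: candidate poses are generated beforehand, and the scoring function is only asked to choose among them. The sketch below is a minimal illustration under stated assumptions, not the CASF implementation. The `Pose` type, the `docking_success_rate` helper, the 2 Å RMSD cutoff, and the convention that a lower score is better are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Pose:
    """One pre-generated ligand pose (hypothetical container for this sketch)."""
    rmsd_to_native: float  # RMSD in angstroms from the experimental pose
    score: float           # value assigned by the scoring function under test


def top_pose_is_native_like(poses: List[Pose], cutoff: float = 2.0,
                            lower_is_better: bool = True) -> bool:
    """Return True if the best-scored pose lies within `cutoff` of the native pose."""
    if not poses:
        raise ValueError("each complex needs at least one pose")
    pick = min if lower_is_better else max
    best = pick(poses, key=lambda p: p.score)
    return best.rmsd_to_native <= cutoff


def docking_success_rate(pose_sets: Dict[str, List[Pose]], cutoff: float = 2.0,
                         lower_is_better: bool = True) -> float:
    """Fraction of complexes whose best-scored pose is native-like."""
    if not pose_sets:
        raise ValueError("no complexes supplied")
    hits = sum(top_pose_is_native_like(poses, cutoff, lower_is_better)
               for poses in pose_sets.values())
    return hits / len(pose_sets)


# Made-up poses for two complexes; no sampling happens here, only scoring.
example = {
    "complex_A": [Pose(0.8, -9.1), Pose(4.5, -7.0), Pose(6.2, -6.3)],
    "complex_B": [Pose(1.5, -5.0), Pose(7.9, -8.2)],
}
print(docking_success_rate(example))
```

Because the poses are fixed inputs, any two scoring functions evaluated this way see exactly the same candidates, so a difference in success rate reflects the scoring step rather than the search algorithm.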

Computational docking is the core process of computer-aided drug design (CADD); it aims at predicting the best orientation and conformation of a small molecule (drug ligand) when bound to a large target receptor molecule (protein) to form a stable complex. The docking quality is typically measured by a scoring function: a mathematical predictive model that produces a score representing the binding free energy and hence the stability of the resulting complex. An effective scoring function should produce promising drug candidates, which can then be synthesized and physically screened using a high-throughput screening (HTS) process. Therefore, the key to CADD is the design of an efficient, highly accurate scoring function. Many traditional techniques have been proposed; however, their performance has generally been poor. Only in the last few years has machine learning (ML) been applied to the design of scoring functions, and the results have been very promising. In this paper, we propose 12 scoring functions based on a wide range of ML techniques. We analyze the performance of each on the scoring power (binding affinity prediction), ranking power (relative ranking prediction), docking power (identifying the native binding poses among computer-generated decoys), and screening power (classifying true binders versus negative binders) using the PDBbind 2013 database. We compare our results with the recently published comparative assessment of scoring functions (CASF-2013) of 20 classical scoring functions, most of which are implemented in mainstream commercial software.
For the scoring and ranking powers, the proposed ML scoring functions depend on a wide range of features (energy terms, pharmacophore, geometrical) that characterize the protein–ligand complexes (about 108 features). These features are extracted from several docking software packages available in the literature; a feature-space reduction technique, namely principal component analysis, is then applied and the performance is studied accordingly. For the docking and screening powers, the proposed ML scoring functions depend on the geometrical features of the RF-Score (36 features) to utilize a larger number of training complexes (relative to the large number of decoys in the testing set). For the scoring power, the best ML scoring function (RF) achieves a Pearson correlation coefficient between the predicted and experimentally determined binding affinities of 0.704 versus 0.614 achieved by the best classical scoring function (X-ScoreHM). For the ranking power, the best ML scoring function (RF) achieves a Spearman correlation coefficient between the ranks based on the predicted and experimentally determined binding affinities of 0.697 versus 0.626 achieved by the best classical scoring function (X-ScoreHM). For the docking power, the best ML scoring function (BRT) has a success rate in identifying the best-scored ligand binding pose within 2 Å root-mean-square deviation from the native pose of 13.8% versus 81.0% achieved by the best classical scoring function (ChemPLP@GOLD). For the screening power, the best ML scoring function (SVM) has an average enrichment factor and success rate at the top 1% level of 3.76 and 6.45% versus 19.54 and 60%, respectively, achieved by the best classical scoring function (GlideScore-SP).
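
The correlation and enrichment metrics quoted above can be sketched in a few lines of plain Python. This is a minimal illustration using textbook definitions, not the authors' evaluation code: the helper names (`pearson`, `average_ranks`, `spearman`, `enrichment_factor`) are hypothetical, the enrichment factor is taken as the hit rate in the top-ranked fraction divided by the overall hit rate, and a higher score is assumed to mean a stronger predicted binder. Any per-target averaging used in the paper is not reproduced here.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between predicted and experimental values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def enrichment_factor(scores: Sequence[float], is_binder: Sequence[bool],
                      fraction: float = 0.01,
                      higher_is_better: bool = True) -> float:
    """Hit rate among the top-ranked `fraction` divided by the overall hit rate."""
    n = len(scores)
    total_binders = sum(is_binder)
    if n == 0 or n != len(is_binder) or total_binders == 0:
        raise ValueError("need matching inputs with at least one true binder")
    n_top = max(1, math.ceil(fraction * n))
    order = sorted(range(n), key=lambda i: scores[i], reverse=higher_is_better)
    binders_in_top = sum(is_binder[i] for i in order[:n_top])
    return (binders_in_top / n_top) / (total_binders / n)


# Made-up numbers, for illustration only.
predicted = [5.1, 6.3, 7.0, 8.4]
measured = [4.8, 6.9, 6.5, 9.0]
print(pearson(predicted, measured), spearman(predicted, measured))
```

An enrichment factor of 1 corresponds to random selection, so under this definition the reported values of 3.76 and 19.54 at the top 1% level would mean roughly 4-fold and 20-fold enrichment of true binders over chance.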

Articles published on Comparative Assessment Of Scoring Functions

Protein-Ligand Binding Affinity Prediction Exploiting Sequence Constituent Homology.

Heterogeneous graph convolutional neural network for protein-ligand scoring

XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein-Ligand Scoring and Ranking.

Assessing protein-ligand interaction scoring functions with the CASF-2013 benchmark.

Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions.

Incorporating specificity into optimization: evaluation of SPA using CSAR 2014 and CASF 2013 benchmarks.

Comparative assessment of machine-learning scoring functions on PDBbind 2013

Comparative Assessment of Scoring Functions on an Updated Benchmark: 1. Compilation of the Test Set

Comparative Assessment of Scoring Functions on an Updated Benchmark: 2. Evaluation Methods and General Results
