Validation of a Field-Based Ligand Screener Using a Novel Benchmarking Data Set for Assessing 3D-Based Virtual Screening Methods.

Ilenia Giangreco,Abhik Mukhopadhyay,Jason C Cole

doi:10.1021/acs.jcim.1c00866

Abstract

Ligand-based methods play a crucial role in virtual screening when the 3D structure of the target is not available. This study discusses the results of a validation study of the CSD field-based ligand screener using a novel benchmarking data set containing 56 targets. The data set was created starting from the target UniProt IDs in a previously published data set (i.e., the AZ data set), by mining ChEMBL to find known active molecules for these targets and by using DUD-E to generate property-matched decoys of the identified actives. Several experiments were performed to assess the virtual screening performance of the new method. One of its strengths is that it can use an overlay of multiple flexible ligands as a query without the need to run several parallel calculations with one ligand at a time. Here, we discuss how changes to different parameter settings or adoption of different query models can influence the final performance compared to the performance when using the experimentally observed overlay of ligands. We have also generated the enrichment scores based on three external benchmark data sets to enable the comparison with existing methods previously validated using these data sets. Here, we present results for the standard DUD-E data set, the DUD-E+ data set, as well as the DUD_Lib_VS_1.0 data set which was designed for ligand-based virtual screening validation and hence is more suitable for this type of methods.

Full Text