Open-source platform to benchmark fingerprints for ligand-based virtual screening

Sereina Riniker, Gregory A Landrum

doi:10.1186/1758-2946-5-26

Open Access

Journal: Journal of Cheminformatics
Publication Date: May 30, 2013
Citations: 298
License type: CC BY 2.0

Affiliation: Novartis (Switzerland)

Abstract

Similarity-search methods using molecular fingerprints are an important tool for ligand-based virtual screening. A huge variety of fingerprints exists, and their performance, usually assessed in retrospective benchmarking studies using data sets with known actives and known or assumed inactives, depends largely on the validation data sets and the similarity measure used. Comparing new methods to existing ones in any systematic way is difficult because standard data sets and evaluation procedures are lacking. Here, we present a standard platform for the benchmarking of 2D fingerprints. The open-source platform contains all source code, structural data for the actives and inactives used (drawn from three publicly available collections of data sets), and lists of randomly selected query molecules to be used for statistically valid comparisons of methods. This allows the exact reproduction and comparison of results in future studies. The results for 12 standard fingerprints and two simple baseline fingerprints, assessed by seven evaluation methods, are shown together with the correlations between methods. High correlations were found between the 12 fingerprints, and a careful statistical analysis showed that only the two baseline fingerprints differed from the others in a statistically significant way. High correlations were also found between six of the seven evaluation methods, indicating that, despite their seeming differences, many of these methods are similar to each other.
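As a rough illustration of what an evaluation method computes in a retrospective benchmark of this kind, the sketch below scores one ranked list of known actives and inactives with two widely used metrics, the area under the ROC curve (AUC) and the enrichment factor (EF). The abstract does not name the seven evaluation methods, so the choice of these two metrics and the helper names `roc_auc` and `enrichment_factor` are assumptions made for illustration; this is not the platform's own code.

```python
from typing import Sequence


def roc_auc(labels_ranked: Sequence[bool]) -> float:
    """AUC for labels ordered from most to least similar to the query.

    For every active, count the inactives ranked below it, then divide by
    the number of active/inactive pairs. Tied scores are not modelled.
    """
    n_act = sum(labels_ranked)
    n_inact = len(labels_ranked) - n_act
    if n_act == 0 or n_inact == 0:
        raise ValueError("need at least one active and one inactive")
    inactives_seen = 0
    correctly_ordered = 0
    for is_active in labels_ranked:
        if is_active:
            correctly_ordered += n_inact - inactives_seen
        else:
            inactives_seen += 1
    return correctly_ordered / (n_act * n_inact)


def enrichment_factor(labels_ranked: Sequence[bool], fraction: float) -> float:
    """Hit rate in the top `fraction` of the list relative to the overall hit rate."""
    n_total = len(labels_ranked)
    n_act = sum(labels_ranked)
    if n_act == 0:
        raise ValueError("need at least one active")
    n_top = max(1, round(fraction * n_total))  # size of the top slice
    hits = sum(labels_ranked[:n_top])
    return (hits / n_top) / (n_act / n_total)


# Toy ranked list (hypothetical data): True marks a known active.
ranked = [True, True, False, True, False, False, False, False, False, False]
auc = roc_auc(ranked)
ef20 = enrichment_factor(ranked, 0.2)
```

In this toy list, 20 of the 21 active/inactive pairs are ordered correctly, and both molecules in the top 20% are actives, against an overall active rate of 3 in 10.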

Highlights

The concept of molecular similarity is often used in the context of ligand-based virtual screening (VS) to use known actives to find new molecules to test [1].
Using the benchmarking platform, the performance of 14 2D fingerprints, covering dictionary-based, path-based and circular fingerprints, was assessed over 88 targets from three publicly available collections of data sets.
The platform uses the open-source cheminformatics toolkit RDKit to calculate fingerprints and similarities, but thanks to its three-stage design, data generated by other sources can be fed in at the validation or analysis stages.
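The decoupling described in the last highlight can be sketched as three functions that communicate only through plain data. This is a minimal sketch under stated assumptions: the name of the first stage ("scoring"), the JSON hand-off format, the set-based similarity and the top-n hit-rate metric are all hypothetical stand-ins, not the platform's actual interfaces.

```python
import json
from statistics import mean


def scoring_stage(query, library):
    """Stage 1 (assumed name): rank library molecules by similarity to the query.

    `query` and every fingerprint are non-empty sets of feature ids; the
    set-overlap similarity is a placeholder for the real scoring code.
    Returns JSON text, so later stages depend only on the data format.
    """
    ranked = sorted(
        ((len(query & fp) / len(query | fp), is_active) for fp, is_active in library),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return json.dumps(ranked)


def validation_stage(ranked_json, top_n):
    """Stage 2: fraction of the top_n ranked molecules that are active."""
    ranked = json.loads(ranked_json)
    return sum(is_active for _, is_active in ranked[:top_n]) / top_n


def analysis_stage(per_method_values):
    """Stage 3: average each method's per-query values."""
    return {name: mean(values) for name, values in per_method_values.items()}


# Hypothetical toy library of (fingerprint, is_active) pairs.
library = [({1, 2, 3}, True), ({1, 2, 9}, True), ({7, 8}, False), ({3, 9}, False)]
internal = scoring_stage({1, 2, 3, 4}, library)

# A ranked list produced elsewhere enters at the validation stage unchanged.
external = json.dumps([[0.9, False], [0.8, True], [0.1, True], [0.0, False]])

summary = analysis_stage({
    "internal": [validation_stage(internal, top_n=2)],
    "external": [validation_stage(external, top_n=2)],
})
```

Because the validation stage reads only a ranked list of score and label pairs, it does not matter whether that list came from the first stage or from another tool.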

Summary

Introduction

The concept of molecular similarity is often used in the context of ligand-based virtual screening (VS) to use known actives to find new molecules to test [1]. The choice of molecular description used to calculate the similarity is not trivial and can vary depending on the compound selection and/or target [5,6,7]. A variety of descriptors exist, which can be divided into two large groups depending on whether they consider only the 2D structure (topology) of a molecule or also include 3D information. Molecular fingerprints [8] are a standard and computationally efficient abstract representation in which structural features are represented either by bits in a bit string or by counts in a count vector.
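The difference between a bit-string and a count-vector fingerprint can be made concrete with a small pure-Python sketch. It assumes the Tanimoto coefficient as the similarity measure (the excerpt does not name one) and uses made-up feature labels; the function names are hypothetical and this is not RDKit's implementation.

```python
from collections import Counter
from typing import Hashable, Set


def tanimoto_bits(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Tanimoto on bit fingerprints, each given as the set of features that are on."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def tanimoto_counts(a: Counter, b: Counter) -> float:
    """Generalized (min/max) Tanimoto on count fingerprints."""
    keys = a.keys() | b.keys()
    shared = sum(min(a[k], b[k]) for k in keys)  # Counter gives 0 for absent keys
    total = sum(max(a[k], b[k]) for k in keys)
    return shared / total if total else 0.0


# Hypothetical structural features found in two molecules (with repeats).
query = ["C-C", "C-C", "C-O", "C=O"]
candidate = ["C-C", "C-O", "C-O", "C-N"]

bit_similarity = tanimoto_bits(set(query), set(candidate))
count_similarity = tanimoto_counts(Counter(query), Counter(candidate))
```

The bit version only records whether a feature is present, so the two molecules share 2 of 4 distinct features. The count version also penalizes the differing multiplicities of `C-C` and `C-O`, which lowers the similarity.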
