A physicochemical descriptor-based scoring scheme for effective and rapid filtering of kinase-like chemical space

Narender Singh, Hongmao Sun, Mohamed Diwan M Abdulhameed, Sidhartha Chaudhury, Gregory Tawa, Anders Wallqvist

doi:10.1186/1758-2946-4-4

Abstract

Background: The current chemical space of known small molecules is estimated to exceed 10^60 structures. Although the largest physical compound repositories contain only a few tens of millions of unique compounds, virtual screening of databases of this size is still difficult. In recent years, physicochemical descriptor-based profiling, such as Lipinski's rule-of-five for drug-likeness and Oprea's criteria for lead-likeness, has gained widespread acceptance as an early-stage filter in drug discovery. In the current study, we outline a kinase-likeness scoring function based on known kinase inhibitors.

Results: The method employs a collection of 22,615 known kinase inhibitors from the ChEMBL database. A kinase-likeness score is computed using statistical analysis of nine key physicochemical descriptors for these inhibitors. Based on this score, we analyze the kinase-likeness of four publicly and commercially available databases: the National Cancer Institute database (NCI), the Natural Products database (NPD), the National Institutes of Health's Molecular Libraries Small Molecule Repository (MLSMR), and the World Drug Index (WDI) database. Three of these databases (NCI, NPD, and MLSMR) are frequently used in the virtual screening of kinase inhibitors, while the fourth, WDI, is included for comparison because it covers a wide range of known chemical space. Based on the kinase-likeness score, a kinase-focused library is also developed and tested against three kinase targets selected from three different branches of the human kinome tree.

Conclusions: Our proposed methodology is one of the first to explore how the narrow chemical space of kinase inhibitors and its relevant physicochemical information can be used to build kinase-focused libraries and to prioritize pre-existing compound databases for screening. We have shown that focused libraries generated by filtering compounds with the kinase-likeness score have, on average, better docking scores than an equivalent number of randomly selected compounds. Beyond library design, our findings also bear on broader efforts to identify kinase inhibitors by screening pre-existing compound libraries. Currently, the NCI library is the most commonly used database for screening kinase inhibitors. Our research suggests that other libraries, such as MLSMR, are more kinase-like and should be given priority in kinase screenings.
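The abstract does not say which nine descriptors are used or how they are combined into a score, so the following Python sketch is only an illustration of the general idea of a descriptor-based likeness score. The descriptor names (`mol_weight`, `logp`, `h_donors`), the Gaussian-shaped closeness term, the function names, and the toy reference values are all assumptions made for this example; none of them come from the paper or from ChEMBL.

```python
from math import exp
from statistics import mean, stdev


def fit_profile(reference):
    """Return {descriptor: (mean, standard deviation)} over a reference set."""
    names = reference[0].keys()
    return {
        name: (mean(r[name] for r in reference), stdev(r[name] for r in reference))
        for name in names
    }


def likeness_score(compound, profile):
    """Average Gaussian-shaped closeness to the reference means, in (0, 1].

    This scoring form is a hypothetical stand-in, not the published function.
    """
    terms = []
    for name, (mu, sigma) in profile.items():
        z = (compound[name] - mu) / sigma
        terms.append(exp(-0.5 * z * z))
    return sum(terms) / len(terms)


# Toy placeholder values for illustration only (not real inhibitor data).
reference = [
    {"mol_weight": 380.0, "logp": 3.1, "h_donors": 2},
    {"mol_weight": 420.0, "logp": 3.8, "h_donors": 1},
    {"mol_weight": 450.0, "logp": 4.2, "h_donors": 3},
]
profile = fit_profile(reference)

typical = {"mol_weight": 415.0, "logp": 3.6, "h_donors": 2}
atypical = {"mol_weight": 150.0, "logp": -1.0, "h_donors": 6}

typical_score = likeness_score(typical, profile)
atypical_score = likeness_score(atypical, profile)
```

In this sketch a compound whose descriptors sit near the reference means scores close to 1, while one far outside the reference distribution scores close to 0, so a database could be ranked or thresholded by the score. A real implementation would compute the descriptors from structures with a cheminformatics toolkit and would follow the scoring function defined in the paper's Methods.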

Highlights

The current chemical space of known small molecules is estimated to exceed 10^60 structures.
To test kinase-likeness of the compounds that are frequently used in virtual screening of kinase inhibitors, the following publicly available small-molecule databases were collected: the National Cancer Institute database (NCI), the Natural Products database (NPD), the National Institutes of Health's Molecular Libraries Small Molecule Repository (MLSMR).
Both the NCI and the NPD databases were downloaded from the ZINC UCSF collection [57] of compounds, while the MLSMR compounds were downloaded from the PubChem database [58], and the World Drug Index (WDI) was obtained from Thomson Reuters [59].

Summary

Introduction

The current chemical space of known small molecules is estimated to exceed 10^60 structures. Other notable collections of compounds include the Chemical Structure Lookup Service (CSLS) [7], with around 46 million unique compounds; PubChem [8] and ChemSpider [9], with around 20 million compounds each; and ZINC [10], with around 13 million compounds, along with hundreds of other public or private collections ranging from a few thousand to a few million compounds. Even though such vast collections constitute only a small fraction of possible chemical space, it is still very difficult to apply a typical biological screen to all molecules in a collection when seeking novel hits on targets of interest [11]. The focus has shifted away from screening large compound libraries to screening smaller, more target-focused libraries that are generated using all relevant information about the target and its known active compounds [13,14,15,16,17].

Journal: Journal of Cheminformatics
Publication Date: Feb 8, 2012
Citations: 70
License type: CC BY 2.0

Similar Papers

Chemoinformatic Analysis of Combinatorial Libraries, Drugs, Natural Products, and Molecular Libraries Small Molecule Repository
Narender Singh ... Marc A Giulianotti
Journal of Chemical Information and Modeling | VOL. 49
20 Mar 2009

Diverging DOS strategy using an allene-containing tryptophan scaffold and a library design that maximizes biologically relevant chemical space while minimizing the number of compounds.
Thomas O Painter ... Xiang-Qun Xie
ACS Combinatorial Science | VOL. 13
18 Feb 2011

Leveraging the Promise of Chemical Genomics
Sarah Webb
BioTechniques | VOL. 52
01 Jan 2012

Comparison of the NCI open database with seven large chemical structural databases.
Johannes H Voigt ... Shaomeng Wang
Journal of Chemical Information and Computer Sciences | VOL. 41
01 May 2001
