A Hierarchical Clustering Approach for Large Compound Libraries

Alexander Böcker, Swetlana Derksen, Elena Schmidt, Andreas Teckentrup, Gisbert Schneider

doi:10.1021/ci0500029

Abstract

A modified version of the k-means clustering algorithm was developed that is able to analyze large compound libraries. A distance threshold determined by plotting the sum of radii of leaf clusters was used as a termination criterion for the clustering process. Hierarchical trees were constructed that can be used to obtain an overview of the data distribution and inherent cluster structure. The approach is also applicable to ligand-based virtual screening with the aim to generate preferred screening collections or focused compound libraries. Retrospective analysis of two activity classes was performed: inhibitors of caspase 1 [interleukin 1 (IL1) cleaving enzyme, ICE] and glucocorticoid receptor ligands. The MDL Drug Data Report (MDDR) and Collection of Bioactive Reference Analogues (COBRA) databases served as the compound pool, for which binary trees were produced. Molecules were encoded by all Molecular Operating Environment 2D descriptors and topological pharmacophore atom types. Individual clusters were assessed for their purity and enrichment of actives belonging to the two ligand classes. Significant enrichment was observed in individual branches of the cluster tree. After clustering a combined database of MDDR, COBRA, and the SPECS catalog, it was possible to retrieve MDDR ICE inhibitors with new scaffolds using COBRA ICE inhibitors as seeds. A Java implementation of the clustering method is available via the Internet (http://www.modlab.de).
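
The abstract describes building binary cluster trees with a k-means variant and stopping on a distance threshold read off a plot of the summed leaf-cluster radii. The following is a minimal Python sketch of that general idea, not the authors' Java implementation. Several details are assumptions made for illustration: the cluster radius is taken to be the largest Euclidean distance from the centroid to a member, each split uses plain 2-means seeded with a random point and the point farthest from it, and all function names (`two_means`, `build_tree`, `sum_of_leaf_radii`) are hypothetical.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def radius(points):
    """Assumed radius definition: largest distance from centroid to a member."""
    c = centroid(points)
    return max(math.dist(c, p) for p in points)


def two_means(points, rng, iterations=20):
    """Split points into two groups with plain k-means (k = 2)."""
    first = rng.choice(points)
    # Assumption: seed the second center with the point farthest from the first.
    second = max(points, key=lambda p: math.dist(first, p))
    centers = [first, second]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearer = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            groups[nearer].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller keeps the node as a leaf
        new_centers = [centroid(groups[0]), centroid(groups[1])]
        if new_centers == centers:
            break  # assignments have converged
        centers = new_centers
    return groups


def build_tree(points, max_radius, rng):
    """Bisect recursively until a leaf's radius is at most max_radius
    or the cluster cannot be split any further."""
    node = {"points": points, "children": []}
    if len(points) < 2 or radius(points) <= max_radius:
        return node
    left, right = two_means(points, rng)
    if left and right:
        node["children"] = [build_tree(left, max_radius, rng),
                            build_tree(right, max_radius, rng)]
    return node


def leaves(node):
    """Collect the leaf nodes of the binary tree."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


def sum_of_leaf_radii(node):
    """Quantity one could plot against the threshold to pick a cut-off."""
    return sum(radius(leaf["points"]) for leaf in leaves(node))


# Toy data: two well-separated groups of 2D descriptor vectors.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
tree = build_tree(data, max_radius=0.5, rng=random.Random(0))
leaf_sizes = sorted(len(leaf["points"]) for leaf in leaves(tree))
```

In this sketch, rebuilding the tree over a range of `max_radius` values and plotting `sum_of_leaf_radii` for each would be one way to look for a suitable termination threshold; how the paper does this in detail is not stated in the abstract.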

Journal: Journal of Chemical Information and Modeling
Publication Date: May 3, 2005
Citations: 58

Similar Papers

Using graph-based consensus clustering for combining K-means clustering of heterogeneous chemical structures
Faisal Saeed ... Naomie Salim
Journal of Cheminformatics | VOL. 5
01 Mar 2013

Toward automated biochemotype annotation for large compound libraries
Xian Chen ... Jun Xu
Molecular Diversity | VOL. 10
01 Aug 2006

Using Molecular Equivalence Numbers to Visually Explore Structural Features that Distinguish Chemical Libraries.
Yong‐Jin Xu ... Mark Johnson
ChemInform | VOL. 33
15 Oct 2002

Using molecular equivalence numbers to visually explore structural features that distinguish chemical libraries.
Yong-Jin Xu ... Mark Johnson
Journal of Chemical Information and Computer Sciences | VOL. 42
08 Jun 2002
