PubChem3D: Similar conformers

Evan E Bolton,Sunghwan Kim,Stephen H Bryant

doi:10.1186/1758-2946-3-13

Abstract

Background: PubChem is a free and open public resource for the biological activities of small molecules. With many tens of millions of both chemical structures and biological test results, PubChem is a sizeable system with an uneven degree of available information. Some chemical structures in PubChem include a great deal of biological annotation, while others have little to none. To help users, PubChem pre-computes "neighboring" relationships to relate similar chemical structures, which may have similar biological function. In this work, we introduce a "Similar Conformers" neighboring relationship to identify compounds with similar 3-D shape and similar 3-D orientation of functional groups typically used to define pharmacophore features.

Results: The first two diverse 3-D conformers of 26.1 million PubChem Compound records were compared to each other, using a shape Tanimoto (ST) of 0.8 or greater and a color Tanimoto (CT) of 0.5 or greater, yielding 8.16 billion conformer neighbor pairs and 6.62 billion compound neighbor pairs, with an average of 253 "Similar Conformers" compound neighbors per compound. Comparing the 3-D neighboring relationship to the corresponding 2-D neighboring relationship ("Similar Compounds") for molecules such as caffeine, aspirin, and morphine, one finds unique sets of related chemical structures, providing additional significant biological annotation. The PubChem 3-D neighboring relationship is also shown to be able to group a set of non-steroidal anti-inflammatory drugs (NSAIDs), despite limited PubChem 2-D similarity.

In a study of 4,218 chemical structures of biomedical interest, consisting of many known drugs, using more diverse conformers per compound results in more 3-D compound neighbors per compound; however, the compound neighbor lists per conformer also increasingly resemble each other, being 38% identical at three conformers and 68% at ten conformers. Perhaps surprisingly, the average count of conformer neighbors per conformer increases rather slowly as a function of the number of diverse conformers considered, with only a 70% increase for a tenfold growth in conformers per compound (a 68-fold increase in the conformer pairs considered).

Neighboring 3-D conformers on the scale performed, if implemented naively, is an intractable problem using a modest-sized compute cluster. The methodology developed in this work relies on a series of filters to avoid performing 3-D superposition optimization when it can be determined that two conformers cannot possibly be neighbors. Most filters are based on Tanimoto equation volume constraints, avoiding incompatible conformers; however, others consider preliminary superposition between conformers using reference shapes.

Conclusion: The "Similar Conformers" 3-D neighboring relationship locates similar small molecules of biological interest that may go unnoticed when using traditional 2-D chemical structure graph-based methods, making it complementary to such methodologies. The computational cost of 3-D similarity methodology on a wide scale, such as PubChem contents, is a considerable issue to overcome. Using a series of efficient filters, an effective throughput rate of more than 150,000 conformers per second per processor core was achieved, more than two orders of magnitude faster than without filtering.
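
The idea behind the volume-based filters mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly used form of the shape Tanimoto, ST = V_AB / (V_AA + V_BB - V_AB), where V_AA and V_BB are the self-overlap volumes and V_AB is the overlap volume of the superposed pair; this page excerpt does not give the formula, so that form is an assumption. Because V_AB cannot exceed the smaller self-overlap volume, ST can never exceed min(V_AA, V_BB) / max(V_AA, V_BB), and any pair whose bound falls below the 0.8 threshold can be skipped without optimizing a superposition. The function names are hypothetical and this is not the authors' implementation.

```python
def shape_tanimoto(v_aa: float, v_bb: float, v_ab: float) -> float:
    """Shape Tanimoto from self-overlap volumes and the pair overlap volume.

    Assumed form: ST = V_AB / (V_AA + V_BB - V_AB).
    """
    return v_ab / (v_aa + v_bb - v_ab)


def max_possible_st(v_aa: float, v_bb: float) -> float:
    """Upper bound on ST for any superposition of the two conformers.

    The overlap volume cannot exceed the smaller self-overlap volume,
    and ST grows with V_AB, so the bound is min(V) / max(V).
    """
    return min(v_aa, v_bb) / max(v_aa, v_bb)


def can_be_neighbor(v_aa: float, v_bb: float, st_threshold: float = 0.8) -> bool:
    """Cheap pre-filter: False means the pair cannot reach the ST threshold."""
    return max_possible_st(v_aa, v_bb) >= st_threshold


if __name__ == "__main__":
    # Illustrative volumes only (arbitrary units), not data from the paper.
    for v_a, v_b in [(100.0, 120.0), (100.0, 130.0)]:
        print(v_a, v_b, round(max_possible_st(v_a, v_b), 3), can_be_neighbor(v_a, v_b))
```

A filter of this kind needs only two precomputed numbers per pair, which is why it is far cheaper than a full superposition optimization.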

Highlights

PubChem is a free and open public resource for the biological activities of small molecules
We describe the multi-conformer PubChem “Similar Conformers” 3-D neighboring relationship and explain various strategies and approaches that made it a tractable problem, including extending the “alignment recycling” methodology to cover the full range of chemical structures considered in the PubChem3D project
In the present paper, the PubChem 3-D “Similar Conformers” neighboring relationship and the methodology used in its computation are described

Summary

Introduction

PubChem is a free and open public resource for the biological activities of small molecules. With many tens of millions of both chemical structures and biological test results, PubChem is a sizeable system with an uneven degree of available information. Some chemical structures in PubChem have a great deal of biological annotation and associated literature, while many others (e.g., those synthesized for high-throughput screening purposes) have little to nothing known about them other than the chemical structure. To help overcome this disparity, PubChem helps users locate and relate data in the archive by pre-computing “neighboring” relationships between similar chemical structures, which may have similar biological function. One such relationship, known as “Similar Compounds”, associates a pair of chemical structures if they have a Tanimoto [5,6,7] similarity of 0.9 or greater when using the PubChem subgraph binary fingerprint [8] and Eq (1).
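
As a rough illustration of the 2-D criterion, the sketch below computes a Tanimoto coefficient on binary fingerprints and applies the 0.9 threshold. It assumes the standard form Tanimoto = AB / (A + B - AB), where A and B are the counts of bits set in each fingerprint and AB is the count of bits set in both; Eq (1) is not reproduced on this page, so treating it as this form is an assumption. Fingerprints are represented here as Python integers used as bit masks, and the function names are hypothetical rather than part of any PubChem API.

```python
def popcount(x: int) -> int:
    """Number of bits set in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two binary fingerprints stored as bit masks."""
    a = popcount(fp_a)
    b = popcount(fp_b)
    ab = popcount(fp_a & fp_b)   # bits set in both fingerprints
    union = a + b - ab           # bits set in at least one fingerprint
    if union == 0:
        return 0.0               # convention chosen here for two empty fingerprints
    return ab / union


def is_similar_compound(fp_a: int, fp_b: int, threshold: float = 0.9) -> bool:
    """True if the pair meets the 2-D similarity threshold quoted in the text."""
    return tanimoto(fp_a, fp_b) >= threshold


if __name__ == "__main__":
    # Toy 4-bit fingerprints, not real PubChem fingerprints.
    print(tanimoto(0b1111, 0b0111))
    print(is_similar_compound(0b1111, 0b0111))
```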

Journal: Journal of Cheminformatics
Publication Date: May 9, 2011
Citations: 48
License type: CC BY 2.0

Similar Papers

PubChem3D: Shape compatibility filtering using molecular shape quadrupoles
Sunghwan Kim ... Stephen H Bryant
Journal of Cheminformatics | VOL. 3 | 20 Jul 2011

PubChem3D: a new resource for scientists.
Evan E Bolton ... Stephen H Bryant
Journal of Cheminformatics | VOL. 3 | 20 Sep 2011

Similar compounds versus similar conformers: complementarity between PubChem 2-D and 3-D neighboring sets.
Sunghwan Kim ... Stephen H Bryant
Journal of Cheminformatics | VOL. 8 | 04 Nov 2016

PubChem3D: Diversity of shape
Evan E Bolton ... Sunghwan Kim
Journal of Cheminformatics | VOL. 3 | 21 Mar 2011
