Abstract

Background

As one of the largest publicly accessible databases for hosting chemical structures and biological activities, PubChem has been processing bioassay submissions from the community since 2004. As the volume of data deposited in PubChem increases, so do the diversity and wealth of its information content. Recently, the Tox21 program deposited a series of pairwise datasets in PubChem covering different mechanisms of action (MOA), such as androgen receptor (AR) agonist and antagonist datasets, to study cell toxicity. To the best of our knowledge, little cheminformatics work has been reported on these datasets, especially the pairwise ones, which may provide insight into the mechanisms of action of the compounds and the relationship between chemical structure and function, as well as guidance for lead compound selection and optimization. To fill this gap, we performed a comprehensive cheminformatics analysis, including scaffold analysis, matched molecular pair (MMP) analysis, and activity cliff analysis, to investigate the structural characteristics and discontinuous structure–activity relationships of each individual dataset (i.e., the AR agonist dataset or the AR antagonist dataset) and of the combined dataset (i.e., the compounds common to the AR agonist and antagonist datasets).

Results

Scaffolds associated only with potential agonists or only with potential antagonists were identified. MMP-based activity cliffs, as well as a small group of compounds reported with dual MOA, were recognized and analyzed. Moreover, a novel concept, the MOA-cliff, was proposed to denote a pair of structurally similar molecules that exhibit opposite MOA.

Conclusions

Cheminformatics methods were successfully applied to the pairwise AR datasets, and the identified molecular scaffold characteristics, MMPs, and activity cliffs might provide useful information for designing new lead compounds for the androgen receptor.

Electronic supplementary material

The online version of this article (doi:10.1186/s13321-016-0150-6) contains supplementary material, which is available to authorized users.
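The MOA-cliff idea described in the abstract can be illustrated with a minimal Python sketch. It assumes that structurally similar pairs have already been computed elsewhere (for example as MMPs) and that each compound carries a single MOA label. The function name `find_moa_cliffs`, the compound IDs, and the labels are all hypothetical; the abstract does not state the similarity criterion or any thresholds the authors used, so none are modeled here.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def find_moa_cliffs(pairs: List[Pair], moa: Dict[str, str]) -> List[Pair]:
    """Return the pairs whose two members carry opposite MOA labels.

    `pairs` is assumed to hold structurally similar compound pairs
    (e.g. precomputed MMPs); this function does not assess similarity.
    """
    opposite = {("agonist", "antagonist"), ("antagonist", "agonist")}
    cliffs = []
    for a, b in pairs:
        # Compounds without a label (or labeled inactive) never form a cliff.
        if (moa.get(a), moa.get(b)) in opposite:
            cliffs.append((a, b))
    return cliffs


# Made-up toy data, for illustration only.
similar_pairs = [("cpd1", "cpd2"), ("cpd2", "cpd3"), ("cpd3", "cpd4")]
moa_labels = {
    "cpd1": "agonist",
    "cpd2": "antagonist",
    "cpd3": "antagonist",
    "cpd4": "inactive",
}

cliffs = find_moa_cliffs(similar_pairs, moa_labels)
print(cliffs)
```

In this toy input only the first pair combines an agonist with an antagonist, so it is the only one flagged. Compounds with dual MOA would need a set of labels per compound rather than a single string.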

Highlights

  • As one of the largest publicly accessible databases for hosting chemical structures and biological activities, PubChem has been processing bioassay submissions from the community since 2004

  • Despite a number of previous data mining efforts [3,4,5,6,7], the demand for researchers to collectively analyze bioactivity data to solve, or provide insights into, scientific questions has only grown, especially in the medicinal chemistry field, where one of the main tasks is to identify and optimize lead compounds towards desired biological activities

  • To fill the gap, we performed a comprehensive study focusing on this data collection using several cheminformatics methods, including scaffold analysis, matched molecular pair (MMP) analysis and activity cliff analysis


