Large-scale bioactivity analysis of the small-molecule assayed proteome.

Tyler William H Backman, Daniel S Evans, Thomas Girke

doi:10.1371/journal.pone.0171413

Open Access

https://doi.org/10.1371/journal.pone.0171413

Abstract

This study presents an analysis of small-molecule bioactivity profiles across a large number of diverse protein families represented in PubChem BioAssay. We compared the bioactivity profiles of FDA approved drugs to those of non-FDA approved compounds, and report several distinct patterns characteristic of the approved drugs. We found that a large fraction of the previously reported higher target promiscuity of FDA approved compounds, relative to non-FDA approved bioactives, was due to cross-reactivity within rather than across protein families. We identified 804 potentially novel protein target candidates for FDA approved drugs, as well as 901 potentially novel target candidates that have active non-FDA approved compounds but no FDA approved drugs with activity against them. We also identified 486348 potentially novel compounds active against the same targets as FDA approved drugs, as well as 153402 potentially novel compounds active against targets without active FDA approved drugs. By quantifying the agreement among replicated screens, we estimated that more than half of these novel outcomes are reproducible. Using biclustering, we identified many dense clusters of FDA approved drugs with enriched activity against a common set of protein targets. We also report the distribution of compound promiscuity, estimated with a Bayesian statistical model, and the sensitivity and specificity of two common methods for identifying promiscuous compounds. Aggregator assays exhibited greater accuracy in identifying highly promiscuous compounds, while PAINS substructures identified a much larger set of “middle range” promiscuous compounds. Additionally, we report a large number of promiscuous compounds not identified as aggregators or PAINS. In summary, the results of this study represent a rich reference for selecting novel drug and target protein candidates, as well as for eliminating candidate compounds with unselective activities.
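The distinction between cross-reactivity within and across protein families can be made concrete with a small sketch. The function name, the family labels, and the toy data below are all hypothetical and are not taken from the study; the sketch only shows how counting distinct targets and counting distinct families can give different pictures of promiscuity for the same compound.

```python
from collections import defaultdict


def promiscuity_counts(actives, family_of):
    """Count distinct targets and distinct protein families hit per compound.

    actives: iterable of (compound_id, target_id) pairs with an active outcome.
    family_of: dict mapping target_id to a family label (hypothetical labels).
    Returns {compound_id: (n_targets, n_families)}.
    """
    targets = defaultdict(set)
    families = defaultdict(set)
    for compound, target in actives:
        targets[compound].add(target)
        families[compound].add(family_of[target])
    return {c: (len(targets[c]), len(families[c])) for c in targets}


# Toy data, invented for illustration only.
family_of = {"P1": "kinase", "P2": "kinase", "P3": "kinase", "P4": "GPCR"}
actives = [
    ("drugA", "P1"), ("drugA", "P2"), ("drugA", "P3"),
    ("cmpdB", "P1"), ("cmpdB", "P4"),
]

counts = promiscuity_counts(actives, family_of)
print(counts)  # {'drugA': (3, 1), 'cmpdB': (2, 2)}
```

In this toy case drugA hits more targets than cmpdB, yet all of its targets sit in one family, so it looks promiscuous at the target level and selective at the family level.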

Highlights

High throughput screening (HTS) is a key technology for identifying bioactive small molecules for chemical genomics and drug discovery applications.
We used the SMARTS-based pan-assay interference compounds (PAINS) filters in the RDKit software library to identify compounds matching PAINS filter sets A, B, or C. These filters derive from the SMARTS conversion published by Saubern et al., which is based on the SLN-format filters originally published by Baell et al. [28, 49]. This identified 19988 PAINS compounds and 298166 non-PAINS compounds among the set of highly screened actives in PubChem BioAssay. Of the compounds we identified as PAINS, 68 are FDA approved drugs.
By systematically analyzing a large volume of public bioactivity data, we highlight several new patterns of bioactivity that may prove useful for informing drug discovery efforts.

Summary

Introduction

High throughput screening (HTS) is a key technology for identifying bioactive small molecules for chemical genomics and drug discovery applications. At the time of writing, the PubChem BioAssay database contains just over 230 million small molecule bioactivity outcomes, over half of which involve activity against a clearly defined protein target [3]. It includes most of the bioactivity data available in the public domain, as it imports assays from many sources such as ChEMBL, and it provides negative (inactive) assay outcomes not reported in many other databases [4]. To investigate why FDA approved drugs on average exhibit activity against a greater number of targets than non-FDA compounds, we computed the target selectivity of small molecules against protein clusters obtained with three distinct methods that classify protein sequences across increasingly large evolutionary distances. To investigate the frequency of highly promiscuous compounds, we used a statistical model to infer the hit ratio of each compound, and report 1157 likely-promiscuous compounds not previously identified by two common methods of identifying promiscuous compounds, aggregator assays and PAINS substructures [12, 28].
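The statistical model used to infer each compound's hit ratio is not specified in this summary. As a minimal sketch of the general idea, the example below assumes a Beta-binomial model with a uniform Beta(1, 1) prior. This is a stand-in chosen for illustration, not necessarily the model used in the study, and the function name and the counts are hypothetical.

```python
def posterior_hit_ratio(n_active, n_tested, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of a compound's hit ratio.

    Assumes actives ~ Binomial(n_tested, p) with a Beta(prior_a, prior_b)
    prior on p, so the posterior is
    Beta(prior_a + n_active, prior_b + n_tested - n_active).
    """
    if not 0 <= n_active <= n_tested:
        raise ValueError("n_active must be between 0 and n_tested")
    a = prior_a + n_active
    b = prior_b + (n_tested - n_active)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


# Two hypothetical compounds with the same raw hit rate (20%) but
# different numbers of assays behind the estimate.
mean_few, var_few = posterior_hit_ratio(n_active=2, n_tested=10)
mean_many, var_many = posterior_hit_ratio(n_active=20, n_tested=100)

print(mean_few, var_few)
print(mean_many, var_many)
```

Both compounds have the same raw hit rate, but the compound screened in more assays has a much narrower posterior. This is one reason a model-based hit ratio is preferable to a raw active/tested fraction when compounds differ widely in how often they were screened.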

Journal: PLOS ONE
Publication Date: Feb 8, 2017
Citations: 4
License type: CC BY 4.0
