Abstract

High throughput screening (HTS) interrogates compound libraries to find those that are “active” in an assay. To better understand compound behavior in HTS, we assessed an existing binomial survivor function (BSF) model of “frequent hitters” using 872 publicly available HTS data sets. This model also identified large numbers of “infrequent hitters,” leading us to reject the BSF for identifying “frequent hitters.” As alternatives, we investigated generalized logistic, gamma, and negative binomial distributions as models for compound behavior. The gamma model reduced the proportion of both frequent and infrequent hitters relative to the BSF. Within this data set, conclusions about individual compound behavior were limited by the number of times individual compounds were tested (1–1613 times) and by disproportionate testing of some compounds. Specifically, most tests (78%) were on a 309,847-compound subset (17.6% of compounds), each tested ≥ 300 times. We concluded that the disproportionate retesting of some compounds represents compound repurposing at scale rather than drug discovery. The approach to drug discovery represented by these 872 data sets characterizes the assays well by challenging them with many compounds, while each compound is characterized poorly with a single assay. Aggregating the testing information for each compound across the multiple screens yielded a continuum with no clear boundary between normal and frequent hitting compounds.

Highlights

  • High throughput screening (HTS) interrogates compound libraries to find those that are “active” in an assay

  • In the development of High Throughput Screening (HTS), some compounds were noted to respond frequently[3], leading to the concept of frequent hitters[3,4,5,6] and pan assay interference compounds (PAINs)[7,8,9]

  • The underpinning instrumental technologies of HTS have been influential in increasing the scale achievable in routine laboratory work, and these technologies are widely deployed in the form of plate readers, lab robotics, and compound libraries accessible to researchers[20]

Summary

Results and discussion

GABA appears multiple times in compound libraries and was tested 893 times in the 872 data sets. It appears four times in the AID175 screen under four different SID (substance identification) numbers, which are all mapped to a single CID. Binomial models based on screen-level probabilities (Fig. 1) do not predict how hits distribute among the compounds in a screen. The hit probabilities of individual compounds are not equal because the chemical properties of the compounds tested differ. Across the 872 screens, compounds were repeatedly tested, but subsequent analysis does not appear to extend much beyond single screens. This leads to multiple retesting of frequent hitters, PAINs, potential PAINs, and otherwise promiscuous compounds, as well as a missed opportunity for discovering new scaffolds. It is unclear whether such compounds should be purged from libraries or built into smaller scale libraries as a pre-HTS compound repurposing stage.
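To make the binomial tail calculation behind a BSF-style model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single shared hit probability for every test and treats the upper tail as the probability of at least the observed number of hits; whether the published model uses an inclusive or strict tail is an assumption here. The function names and the example hit count and probability are hypothetical. Only the 893 tests come from the text.

```python
from math import exp, fsum, lgamma, log, log1p


def _log_pmf(k: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability of exactly k hits (0 < p < 1)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log1p(-p)


def _tail(n: int, p: float, ks: range) -> float:
    """Sum the binomial probabilities over the hit counts in ks."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:  # all mass at 0 hits
        return 1.0 if 0 in ks else 0.0
    if p == 1.0:  # all mass at n hits
        return 1.0 if n in ks else 0.0
    # Working in log space avoids overflow for large n (e.g. 1613 tests).
    return min(1.0, fsum(exp(_log_pmf(k, n, p)) for k in ks))


def upper_tail(n_tests: int, n_hits: int, p_hit: float) -> float:
    """P(X >= n_hits): small values suggest a 'frequent hitter'."""
    return _tail(n_tests, p_hit, range(max(n_hits, 0), n_tests + 1))


def lower_tail(n_tests: int, n_hits: int, p_hit: float) -> float:
    """P(X <= n_hits): small values suggest an 'infrequent hitter'."""
    return _tail(n_tests, p_hit, range(0, min(n_hits, n_tests) + 1))


if __name__ == "__main__":
    # Hypothetical compound: 893 tests (as for GABA in the text),
    # an assumed 20 hits, and an assumed shared hit probability of 1%.
    print(upper_tail(893, 20, 0.01))
    print(lower_tail(893, 20, 0.01))
```

Because the sketch uses one shared probability, it inherits the limitation described above: when hit probabilities differ between compounds, both tails can flag many compounds as unusual.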

Conclusions
Methods
Data and code availability
