Statistical models for identifying frequent hitters in high throughput screening

Samuel Goodwin, Golnaz Shahtahmassebi, Quentin S Hanley

doi:10.1038/s41598-020-74139-0

Open Access

https://doi.org/10.1038/s41598-020-74139-0

Journal: Scientific Reports	Publication Date: Oct 14, 2020
Citations: 6	License type: open-access

Affiliation: Nottingham Trent University

Abstract

High throughput screening (HTS) interrogates compound libraries to find those that are “active” in an assay. To better understand compound behavior in HTS, we assessed an existing binomial survivor function (BSF) model of “frequent hitters” using 872 publicly available HTS data sets. We found large numbers of “infrequent hitters” using this model leading us to reject the BSF for identifying “frequent hitters.” As alternatives, we investigated generalized logistic, gamma, and negative binomial distributions as models for compound behavior. The gamma model reduced the proportion of both frequent and infrequent hitters relative to the BSF. Within this data set, conclusions about individual compound behavior were limited by the number of times individual compounds were tested (1–1613 times) and disproportionate testing of some compounds. Specifically, most tests (78%) were on a 309,847-compound subset (17.6% of compounds) each tested ≥ 300 times. We concluded that the disproportionate retesting of some compounds represents compound repurposing at scale rather than drug discovery. The approach to drug discovery represented by these 872 data sets characterizes the assays well by challenging them with many compounds while each compound is characterized poorly with a single assay. Aggregating the testing information from each compound across the multiple screens yielded a continuum with no clear boundary between normal and frequent hitting compounds.
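As a rough illustration of the kind of calculation a binomial survivor function (BSF) model involves, the sketch below computes binomial tail probabilities for a compound with k hits in n tests. It assumes a single per-test hit probability p and takes the upper tail as P(X ≥ k). The function names, the tail convention, and the example numbers are illustrative assumptions, not the authors' code or data.

```python
from math import exp, fsum, lgamma, log


def _log_binom_pmf(i, n, p):
    """Log of the Binomial(n, p) probability mass at i (requires 0 < p < 1)."""
    log_choose = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_choose + i * log(p) + (n - i) * log(1.0 - p)


def binomial_survivor(k, n, p):
    """Upper tail P(X >= k): small values suggest a 'frequent hitter'.

    The >= convention is an assumption of this sketch; some libraries
    define the survivor function as P(X > k).
    """
    if not (0 <= k <= n) or not (0.0 < p < 1.0):
        raise ValueError("need 0 <= k <= n and 0 < p < 1")
    # Work in log space so large n (compounds tested >1000 times) does not overflow.
    return fsum(exp(_log_binom_pmf(i, n, p)) for i in range(k, n + 1))


def binomial_lower_tail(k, n, p):
    """Lower tail P(X <= k): small values suggest an 'infrequent hitter'."""
    if not (0 <= k <= n) or not (0.0 < p < 1.0):
        raise ValueError("need 0 <= k <= n and 0 < p < 1")
    return fsum(exp(_log_binom_pmf(i, n, p)) for i in range(0, k + 1))


if __name__ == "__main__":
    # Hypothetical compound: 12 hits in 300 tests, assumed hit rate of 1% per test.
    print(binomial_survivor(12, 300, 0.01))
    print(binomial_lower_tail(0, 300, 0.01))
```

Because both tails are computed under the same fixed p, a compound whose true hit probability differs from p in either direction will be flagged, which is consistent with the abstract's observation that the BSF yields many "infrequent hitters" as well as frequent ones.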

Highlights

High throughput screening (HTS) interrogates compound libraries to find those that are “active” in an assay.
In the development of High Throughput Screening (HTS), some compounds were noted to respond frequently[3], leading to the concept of frequent hitters[3,4,5,6] and pan assay interference compounds (PAINs)[7,8,9].
The underpinning instrumental technologies of HTS have been influential for increasing the scale achievable in routine laboratory work, and these technologies are widely deployed in the form of plate readers, lab robotics, and compound libraries accessible to researchers[20].

Summary

Results and discussion

GABA appears multiple times in compound libraries and was tested 893 times in the 872 data sets. It appears four times in the AID175 screen under four different SID (substance identification) numbers, which are all mapped to a single CID. Binomial models based on screen level probabilities (Fig. 1) do not predict how hits distribute among the compounds in a screen. These probabilities are not equal due to the chemical properties of the compounds tested. Across the 872 screens, compounds were repeatedly tested but subsequent analysis does not appear to extend much beyond single screens. This leads to multiple retesting of frequent hitters, PAINs, potential PAINs, and otherwise promiscuous compounds as well as missing an opportunity for discovering new scaffolds. It is unclear whether they should be purged from libraries or built into smaller scale libraries as a pre-HTS compound repurposing stage.
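The mapping of several SIDs onto one CID means that per-compound test and hit counts have to be aggregated at the CID level before any model is fitted. A minimal sketch of that bookkeeping follows, assuming screening results arrive as (AID, SID, active) records and that an SID-to-CID lookup is available. The record layout, the function name, and all identifiers in the example are made up for illustration and are not taken from the paper's data.

```python
from collections import defaultdict


def aggregate_by_cid(records, sid_to_cid):
    """Collapse substance-level results into per-compound (tests, hits) counts.

    records    -- iterable of (aid, sid, is_active) tuples (assumed layout)
    sid_to_cid -- dict mapping each SID to its CID (assumed to be complete)
    """
    totals = defaultdict(lambda: [0, 0])  # cid -> [number of tests, number of hits]
    for _aid, sid, is_active in records:
        cid = sid_to_cid[sid]
        totals[cid][0] += 1
        totals[cid][1] += int(bool(is_active))
    return {cid: (tests, hits) for cid, (tests, hits) in totals.items()}


if __name__ == "__main__":
    # Hypothetical screen in which four SIDs all map to the same CID.
    sid_to_cid = {1001: 42, 1002: 42, 1003: 42, 1004: 42, 2001: 77}
    records = [
        (175, 1001, False),
        (175, 1002, True),
        (175, 1003, False),
        (175, 1004, False),
        (175, 2001, False),
    ]
    print(aggregate_by_cid(records, sid_to_cid))
```

Counting each SID as a separate test, as done here, is one possible choice; treating duplicate SIDs within a single screen as one test would give smaller test counts per compound.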

Conclusions

Methods

Data and code availability
