Abstract

High-throughput screening (HTS) assesses compound libraries for “activity” using target assays. A subset of HTS data contains a large number of sample measurements replicated a small number of times, providing an opportunity to introduce the distribution of standard deviations (DSD). Applying the DSD to several HTS data sets revealed signs of bias in some of the data and uncovered a sub-population of compounds exhibiting high variability, which may be difficult to screen. In the data examined, 21% of 1189 such compounds were pan-assay interference compounds (PAINS); this proportion reached 57% for the most closely related compounds within the sub-population. Using the DSD, large HTS data sets can in many cases be modelled as two distributions: a large group of nearly normally distributed “inactive” compounds and a residual distribution of “active” compounds. The latter were not normally distributed, overlapped the inactive distribution on both sides, and were larger than typically assumed. Consequently, a large number of compounds, some of which could become the next generation of drugs, are being misclassified as “inactive” or are invisible to current methods. Although applied here to HTS, the DSD is applicable to any data set with a large number of samples each measured a small number of times.

Highlights

  • A range of procedures have been described to control spatial effects in plates[2,5,7,8], and further guidance exists for many aspects of hit detection[7,9]

  • Clear warnings may be found in the literature of “frequent hitters”[15] and pan-assay interference compounds (PAINS)[16,17]

  • To test high-throughput screening (HTS) data for homoscedasticity and consistency with normal statistics, the average and standard deviation were computed for each compound in three sets of HTS results, and histograms were constructed (Fig. 2)

Introduction

A range of procedures have been described to control spatial effects in plates[2,5,7,8], and further guidance exists for many aspects of hit detection[7,9]. The usual procedure for assigning a compound as “active” is by consideration of mean or median values (for N > 1) or by comparison of single measurements with a mean or median value (N = 1). This is only one of many valid ways to make a statistical inference. A better view of this activity is to state the “inactive” model, test the validity of that model, and assign the likelihood that a particular compound’s measurement conforms to that model. This can remove thresholds entirely from experimental design and presentation. Problematic compounds have been inferred from cross-referencing multiple studies.
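The model-based procedure described above can be sketched in Python. This is a minimal illustration on simulated data, not the paper’s implementation: the replicate count, the size of the “active” sub-population, and the robust estimates of the inactive bulk are all assumptions made for the example. Each compound’s replicate measurements yield a mean and a sample standard deviation (the per-compound values underlying the DSD), and each compound is then assigned a probability that its mean conforms to the stated “inactive” model.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative data (an assumption, not real HTS results): 10,000 compounds,
# each measured in triplicate (N = 3). Most are "inactive" (noise about 0);
# a small sub-population of "active" compounds is shifted.
n_compounds, n_reps = 10_000, 3
data = rng.normal(0.0, 1.0, size=(n_compounds, n_reps))
data[:200] += rng.normal(5.0, 2.0, size=(200, 1))  # active sub-population

means = data.mean(axis=1)
stds = data.std(axis=1, ddof=1)  # per-compound sample SDs (the DSD values)

# State the "inactive" model. Here its centre and spread are estimated
# robustly from the bulk of the data (medians), an illustrative choice.
mu0 = np.median(means)
sd0 = np.median(stds)

# Assign each compound a two-sided p-value under the inactive model,
# rather than applying a fixed activity threshold.
z = (means - mu0) / (sd0 / np.sqrt(n_reps))
p = 2 * norm.sf(np.abs(z))

print(f"compounds inconsistent with the inactive model (p < 0.001): {(p < 0.001).sum()}")
```

Reporting the per-compound p-value (or likelihood) rather than a hit/no-hit flag is what allows thresholds to be removed from the presentation: the threshold, if any, is applied by the reader, not baked into the screen.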
