Abstract

A significant challenge in high‐throughput screening (HTS) campaigns is the identification of assay technology interference compounds. A Compound Interfering with an Assay Technology (CIAT) gives false readouts in many assays. CIATs are often mistaken for viable hits and investigated in follow‐up studies, which impedes research and wastes resources. In this study, we developed a machine‐learning (ML) model to predict CIATs for three assay technologies. The model was trained on known CIATs and non‐CIATs (NCIATs), identified in artefact assays and described by their 2D structural descriptors. Existing methods for identifying CIATs are typically based on statistical analysis of historical primary screening data and do not consider experimental assays that identify CIATs. Our results show successful prediction of CIATs for existing and novel compounds and provide a complementary and wider set of predicted CIATs compared to the Binomial Survivor Function (BSF) score, a published structure‐independent model, and to the PAINS substructural filters. Our analysis is an example of how well‐curated datasets can provide powerful predictive models despite their relatively small size.

Highlights

  • High-throughput screening (HTS)[1,2] is a widely used approach in lead discovery

  • Compounds were classified as CIATs (Compounds Interfering with an Assay Technology) or non-CIATs (NCIATs)

  • When a large dataset is available, a statistical method such as the Binomial Survivor Function (BSF) score is efficient, but compounds need to be tested a significant number of times for the score to outperform random forest classification (RFC)
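The dependence of a binomial survivor function on the number of tests can be illustrated with a minimal sketch. It assumes the score is the negative log10 of the probability that a compound is active in at least the observed number of assays under a fixed background hit rate; the published BSF score may be defined differently in detail. The function names, the `hit_rate` parameter, and the example numbers are hypothetical and are not taken from the study.

```python
from math import comb, log10


def binomial_survivor(n_tested: int, n_active: int, hit_rate: float) -> float:
    """Return P(X >= n_active) for X ~ Binomial(n_tested, hit_rate).

    hit_rate is an assumed background probability that an arbitrary
    compound is active in a single assay.
    """
    if not 0 <= n_active <= n_tested:
        raise ValueError("n_active must lie between 0 and n_tested")
    if not 0.0 < hit_rate < 1.0:
        raise ValueError("hit_rate must lie strictly between 0 and 1")
    tail = sum(
        comb(n_tested, k) * hit_rate**k * (1.0 - hit_rate) ** (n_tested - k)
        for k in range(n_active, n_tested + 1)
    )
    # Guard against floating-point rounding pushing the sum above 1.
    return min(1.0, tail)


def survivor_score(n_tested: int, n_active: int, hit_rate: float) -> float:
    """Negative log10 of the survivor probability (illustrative score)."""
    return -log10(binomial_survivor(n_tested, n_active, hit_rate))


if __name__ == "__main__":
    # Same fraction of active readouts (50%), different numbers of tests.
    for n_tested, n_active in [(2, 1), (10, 5), (40, 20)]:
        score = survivor_score(n_tested, n_active, hit_rate=0.05)
        print(f"tested={n_tested:3d} active={n_active:3d} score={score:.2f}")
```

In this sketch, a compound that is active in half of its assays receives a much higher score when it has been tested 40 times than when it has been tested twice, which is consistent with the point that a purely statistical score needs many measurements per compound before it becomes informative.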


Summary

Introduction

High-throughput screening (HTS)[1,2] is a widely used approach in lead discovery. With this method, activity data for hundreds of thousands of compounds can be generated in very little time. We present a novel random forest classification (RFC) model that predicts assay technology interference from molecular structures. We compare this model with the BSF score and with the results of applying PAINS filters. The latter comparison is done to investigate the ability of the PAINS filters to identify CIATs. In contrast to methods that use HTS primary screening results as input data, the datasets used for the RFC models contain results from historical counter-screen assays, which are used to experimentally rule out assay technology interference mechanisms from primary HTS datasets. Three extensively applied HTS assay technologies are investigated: AlphaScreen,[22] FRET[23] (Förster resonance energy transfer), and TR-FRET[24] (time-resolved fluorescence resonance energy transfer).

