Abstract

In large-scale virtual screening (VS) campaigns, data are often computed for millions of compounds to identify leads, yet two tasks remain: prioritizing VS "hits" for experimental assays, and distinguishing true positives from false positives. We present two statistical methods for mining large databases: (1) a general scoring metric based on the VS signal-to-noise level within a compound neighborhood; (2) a neighborhood-based sampling strategy for reducing database size, in lieu of property-based filters.
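
The abstract does not give the formulas behind either method, but both admit a natural reading in terms of a pairwise compound-similarity matrix (e.g., Tanimoto similarity over fingerprints). The sketch below is illustrative only: it assumes a z-score-style signal-to-noise metric (neighborhood mean score versus the database-wide score distribution) and a greedy sphere-exclusion sampler as one plausible "neighborhood-based" reduction. The function names, the 0.7 similarity cutoff, and these specific formulations are assumptions, not the paper's definitions.

```python
import numpy as np

def neighborhood_snr(scores, similarity, threshold=0.7):
    """Signal-to-noise of each compound's neighborhood (illustrative).

    scores     : (N,) raw VS scores, higher = better.
    similarity : (N, N) pairwise similarity matrix, e.g. Tanimoto.
    threshold  : similarity cutoff defining a compound's neighborhood.

    Compares the mean VS score of a compound's neighbors against the
    database-wide score distribution, scaled by the standard error of
    a same-sized random sample (a z-score over neighborhood means).
    """
    mu, sigma = scores.mean(), scores.std(ddof=1)
    snr = np.empty(len(scores))
    for i in range(len(scores)):
        nbrs = similarity[i] >= threshold          # includes compound i
        n = nbrs.sum()
        snr[i] = (scores[nbrs].mean() - mu) / (sigma / np.sqrt(n))
    return snr

def neighborhood_sample(similarity, threshold=0.7):
    """Greedy sphere-exclusion sampling (illustrative): keep a compound,
    discard everything within `threshold` similarity of it, and repeat,
    so the retained subset covers the database with roughly one
    representative per neighborhood."""
    n = similarity.shape[0]
    remaining = np.ones(n, dtype=bool)
    kept = []
    for i in range(n):
        if remaining[i]:
            kept.append(i)
            # Drop i's whole neighborhood (self-similarity 1.0 drops i too).
            remaining &= similarity[i] < threshold
    return kept
```

Under this reading, compounds whose neighborhood SNR stands well above the background would be natural candidates for assay prioritization, while `neighborhood_sample` returns the indices of one representative per similarity sphere as a structure-aware alternative to property-based filtering.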
