Abstract

Recent developments in high-throughput screening and combinatorial chemistry have generated interest in experimental design methods for selecting subsets of molecules from large chemical databases. This manuscript describes three such methods: edge designs, spread designs, and coverage designs. It also describes two algorithms with linear time complexity that approximate spread and coverage designs. These algorithms can be threaded for multiprocessor systems, are compatible with any definition of molecular distance, and may be applied to very large chemical databases. For example, ten thousand molecules were selected using the maximum dissimilarity approximation to a spread design from a sixty-dimensional simulated molecular database of one million molecules in approximately 6 h on a UNIX workstation.
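
The abstract does not spell out the maximum dissimilarity algorithm, so the following is only a minimal sketch of one common greedy formulation, not necessarily the authors' implementation. It assumes molecules are represented as numeric descriptor tuples, that the first pick is a caller-supplied seed, and that distance is Euclidean by default; the names `max_dissimilarity_select`, `euclidean`, and `seed_index` are hypothetical. Each step adds the candidate whose distance to its nearest already-selected molecule is largest, and a cached nearest-distance list keeps the cost at one pass over the database per selected molecule.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_dissimilarity_select(points, k, distance=euclidean, seed_index=0):
    """Greedily pick k indices from points so the picks are spread out.

    Assumption: this is a generic greedy maximin selection, used here to
    illustrate the idea of a maximum dissimilarity approximation.
    Any distance function can be passed in via `distance`.
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= len(points)")

    selected = [seed_index]
    # nearest[i] holds the distance from points[i] to its closest selected
    # point; selected points are marked with -1 so they are never re-picked.
    nearest = [distance(p, points[seed_index]) for p in points]
    nearest[seed_index] = -1.0

    while len(selected) < k:
        # Choose the candidate farthest from the current selection.
        best = max(range(n), key=nearest.__getitem__)
        selected.append(best)
        nearest[best] = -1.0
        # One pass over the database updates the cached nearest distances.
        for i in range(n):
            if nearest[i] < 0:
                continue
            d = distance(points[i], points[best])
            if d < nearest[i]:
                nearest[i] = d
    return selected


if __name__ == "__main__":
    toy_database = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
    print(max_dissimilarity_select(toy_database, 3))
```

Because `distance` is a parameter, the same sketch works with any molecular distance. The inner update loop touches each molecule independently, which is the part that could be split across threads or processes.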
