Abstract
Recent developments in high-throughput screening and combinatorial chemistry have generated interest in experimental design methods to select subsets of molecules from large chemical databases. In this manuscript three methods for selecting molecules from large databases are described: edge designs, spread designs, and coverage designs. Two algorithms with linear time complexity that approximate spread and coverage designs are described. These algorithms can be threaded for multiprocessor systems, are compatible with any definition of molecular distance, and may be applied to very large chemical databases. For example, ten thousand molecules were selected using the maximum dissimilarity approximation to a spread design from a sixty-dimensional simulated molecular database of one million molecules in approximately 6 h on a UNIX workstation.
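As an illustration of the maximum dissimilarity idea mentioned above, the following is a minimal sketch of the widely used greedy max-min selection, not necessarily the authors' exact algorithm. The function names, the `seed_index` parameter, and the Euclidean default are assumptions made for this example. Any distance function can be supplied, and each added molecule costs one pass over the database, so the total work is proportional to the database size times the number of molecules selected.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_dissimilarity_select(points, k, distance=euclidean, seed_index=0):
    """Greedy max-min selection (hypothetical helper, not from the paper).

    Starting from points[seed_index], repeatedly add the candidate whose
    distance to its nearest already-selected point is largest.
    Returns the list of selected indices in selection order.
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(points)")

    selected = [seed_index]
    # nearest[i] holds the distance from candidate i to its closest
    # selected point; -1.0 marks candidates that are already selected.
    nearest = [distance(p, points[seed_index]) for p in points]
    nearest[seed_index] = -1.0

    while len(selected) < k:
        # Pick the candidate farthest from the current selection.
        nxt = max(range(n), key=nearest.__getitem__)
        selected.append(nxt)
        nearest[nxt] = -1.0
        # One pass over the database to refresh nearest-selected distances.
        for i in range(n):
            if nearest[i] >= 0.0:
                d = distance(points[i], points[nxt])
                if d < nearest[i]:
                    nearest[i] = d
    return selected


if __name__ == "__main__":
    # Simulated descriptor data, far smaller than the database in the abstract.
    rng = random.Random(0)
    database = [[rng.random() for _ in range(5)] for _ in range(1000)]
    subset = max_dissimilarity_select(database, 10)
    print(subset)
```

Because only the `distance` argument touches the molecular representation, a different dissimilarity measure can be substituted without changing the selection loop. The inner refresh pass is also the natural place to split work across threads.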
More From: Journal of Chemical Information and Computer Sciences