Abstract

This paper reports a comparison of calculated molecular properties and 2D fragment bit‐strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity‐based selection and k‐means cluster‐based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k‐means subsets; that the property‐based descriptors are marginally superior to the fragment‐based descriptors; and that both approaches are noticeably superior to random selection.
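As an illustration of the MaxMin dissimilarity‐based selection named above, the following is a minimal sketch and not the paper's implementation. It assumes each compound is represented by the set of "on" bit positions in its fragment bit‐string, and it assumes Tanimoto distance as the dissimilarity measure; the abstract does not state which coefficient or starting rule was used. The function names `tanimoto_distance` and `maxmin_select`, and the random choice of the first compound, are hypothetical choices made for this example.

```python
import random


def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - Tanimoto similarity for two sets of 'on' bit positions."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty bit-strings are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, n_select, seed=0):
    """Greedy MaxMin selection (assumed variant): repeatedly add the compound
    whose distance to its nearest already-selected compound is largest."""
    n = len(fingerprints)
    if not 0 < n_select <= n:
        raise ValueError("n_select must be between 1 and len(fingerprints)")

    # Assumption: the first compound is chosen at random.
    first = random.Random(seed).randrange(n)
    selected = [first]
    chosen = {first}

    # nearest[i] is the distance from compound i to its closest selected compound.
    nearest = [tanimoto_distance(fp, fingerprints[first]) for fp in fingerprints]

    while len(selected) < n_select:
        candidates = [i for i in range(n) if i not in chosen]
        nxt = max(candidates, key=lambda i: nearest[i])
        selected.append(nxt)
        chosen.add(nxt)
        # Only the newly added compound can reduce a nearest-neighbour distance.
        for i, fp in enumerate(fingerprints):
            d = tanimoto_distance(fp, fingerprints[nxt])
            if d < nearest[i]:
                nearest[i] = d
    return selected


# Toy data: three closely related bit-strings and one unrelated one (index 2).
toy_fingerprints = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({1, 2, 3, 4}),
]
subset = maxmin_select(toy_fingerprints, n_select=2)
```

On the toy data, the unrelated bit‐string at index 2 is always part of a two‐compound subset, whichever compound is chosen first, which is the behaviour a diversity selection is meant to show. This sketch costs on the order of the file size times the subset size in distance evaluations.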
