Abstract
For fingerprint searching using multiple active reference compounds, an information entropy-based similarity method is introduced as an alternative to conventional similarity coefficients and search strategies. The approach involves determining the fingerprint bit pattern entropy of a compound reference set and recalculating the entropy after each individual test compound is added. If a database compound shares similar bit patterns with reference set molecules, adding this compound to the reference set produces only a small change in system entropy. By contrast, inclusion of a compound having a dissimilar fingerprint leads to a notable increase in entropy. Thus, database compounds can be screened for candidate molecules that do not cause significant changes in reference set fingerprint entropy. Compared to nearest neighbor methods, this approach has the computational advantage that it extracts reference set information only once, prior to similarity searching. Test calculations on different compound data sets, fingerprints, and screening databases reveal that the ability of our entropy-based method to detect active compounds is often superior to that of data fusion techniques and Tanimoto similarity calculations.
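The screening idea described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work: it assumes that the fingerprint entropy of a set is the sum of independent per-bit binary Shannon entropies, that fingerprints are equal-length lists of 0/1 values, and that candidates are ranked by the signed entropy change in ascending order. The function names (`bit_entropy`, `summarize_reference`, `entropy_change`, `rank_database`) and the toy fingerprints are hypothetical.

```python
import math


def bit_entropy(p):
    """Binary Shannon entropy (in bits) of a position set with frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def summarize_reference(reference):
    """Extract reference set information once: size, per-bit counts, entropy.

    Assumption: set entropy is the sum of per-bit entropies.
    """
    n = len(reference)
    counts = [sum(column) for column in zip(*reference)]
    entropy = sum(bit_entropy(c / n) for c in counts)
    return n, counts, entropy


def entropy_change(summary, fingerprint):
    """Entropy of (reference set + one test compound) minus the baseline."""
    n, counts, baseline = summary
    augmented = sum(
        bit_entropy((c + b) / (n + 1)) for c, b in zip(counts, fingerprint)
    )
    return augmented - baseline


def rank_database(reference, database):
    """Return (entropy change, database index) pairs, smallest change first."""
    summary = summarize_reference(reference)
    scored = [(entropy_change(summary, fp), i) for i, fp in enumerate(database)]
    return sorted(scored)


# Toy 4-bit fingerprints (illustrative data only).
reference_set = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
database = [[0, 0, 1, 1], [1, 1, 0, 0]]
for change, index in rank_database(reference_set, database):
    print(f"compound {index}: entropy change = {change:.3f}")
```

Because the per-bit counts and the baseline entropy are computed once in `summarize_reference`, scoring each database compound only requires a single pass over its bits, which mirrors the computational advantage over nearest neighbor searching noted above.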