Evaluation of the molecular similarity and property prediction for QSAR purposes

Borka Jerman-Blažič, Milan Randić, Irena Fabič-Petrač

doi:10.1016/0169-7439(89)80064-4

Abstract

A string comparison method has been developed and applied to measuring the molecular similarity of chemical structures. The molecular structures were encoded as sequences of numbers representing counts of paths of different lengths. The similarity index between two compounds was calculated as the difference between the gains of information derived through a comparison of the corresponding molecular path sequences. Strings representing the ordering of compounds according to their similarity were used to cluster the elements of the data set studied. An unknown compound was classified into one of the resulting clusters, and the properties associated with that cluster were used to predict some of its molecular properties. The method is illustrated on two groups of compounds, barbiturates and benzamidines. The algorithms and the programs used are described briefly.
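The encoding step can be made concrete with a small sketch. The Python code below is not the authors' program. It assumes a molecule is given as a hydrogen-suppressed graph (a dictionary mapping each atom to its neighbours) and counts self-avoiding paths of each length, with length 0 counting the atoms themselves. The function name `path_counts` and the two example graphs are hypothetical and chosen only for illustration. The information-based similarity index is not specified in the abstract, so it is not reproduced here.

```python
def path_counts(adjacency, max_length):
    """Return [p0, p1, ..., p_max], where pk is the number of
    self-avoiding paths of length k (k bonds) in the graph."""
    counts = [0] * (max_length + 1)

    def extend(atom, visited, length):
        # Every call corresponds to one directed path ending at `atom`.
        counts[length] += 1
        if length == max_length:
            return
        for neighbour in adjacency[atom]:
            if neighbour not in visited:
                visited.add(neighbour)
                extend(neighbour, visited, length + 1)
                visited.remove(neighbour)

    for start in adjacency:
        extend(start, {start}, 0)

    # Paths of length >= 1 are found once from each end, so halve them.
    return [counts[0]] + [c // 2 for c in counts[1:]]


# Carbon skeletons only (assumed hydrogen-suppressed graphs).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

butane_sequence = path_counts(n_butane, 3)
isobutane_sequence = path_counts(isobutane, 3)
```

The two isomers have the same number of atoms and bonds but differ in the counts of longer paths, which is the kind of structural difference a comparison of path sequences can pick up.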
