An elusive goal in the field of chemoinformatics and molecular modeling has been the generation of a set of descriptors that, once calculated for a molecule, may be used in a wide variety of applications. Since such universal descriptors are generated free from external constraints, they are inherently independent of the data set in which they are employed. The realization of a set of universal descriptors would significantly streamline chemoinformatics tasks such as virtual high-throughput screening (VHTS) and toxicity profiling. The current study reports the derivation and validation of a potential set of universal descriptors, referred to as the 4D-fingerprints, which are derived from 4D-molecular similarity analysis. To evaluate their applicability as universal descriptors, the 4D-fingerprints are used to generate descriptive QSAR models for five independent training sets. Each training set has been analyzed previously by several different QSAR methods, and the models generated using the 4D-fingerprints are compared with the results of those earlier analyses. Based on statistical measures of fit and test set prediction, the models generated using the 4D-fingerprints are comparable in quality to the previously reported models from the other QSAR methods. This finding is particularly significant because the 4D-fingerprints are generated independently of external constraints such as alignment, whereas the QSAR methods used for comparison all require an alignment analysis.