Abstract

Clustering is an unsupervised method that allows researchers to group objects and gather information about the relationships among them. In chemoinformatics, clustering allows hypotheses to be drawn about a compound's biological, chemical, and physical properties by comparison with other compounds. We introduce an improved spectral clustering algorithm for chemical compound clustering that uses multiple data sources. The tensor-based spectral methods used in this paper provide chemically appropriate and statistically significant results when clustering compounds from both the GSK-ChEMBL Malaria data set and the ZINC database. Spectral clustering algorithms based on the tensor method give robust results on the mid-size compound sets used here. The goal of this paper is to present a tensor-based multi-view method for clustering chemical compounds that is of value to the medicinal chemistry community. Our findings show compounds of extremely different chemotypes clustering together, which hints at the chemogenomic nature of our method.
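
To make the idea of tensor-based multi-view spectral clustering concrete, the following is a minimal sketch and not the algorithm evaluated in this paper. It assumes a Gaussian-kernel affinity per view, stacks the normalized affinities into a third-order tensor, and takes the leading left singular vectors of the mode-1 unfolding (an HOSVD-style step) as a shared embedding. The function names, the `sigma` parameter, and the synthetic two-view data are all hypothetical; real inputs would be per-compound descriptors such as fingerprints or bioactivity profiles.

```python
import numpy as np


def normalized_affinity(X, sigma=1.0):
    """Gaussian-kernel affinity with symmetric normalization D^-1/2 W D^-1/2."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-similarity
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multiview_embedding(views, k, sigma=1.0):
    """Shared k-dimensional embedding from several views of the same objects."""
    # Assumption: one n x n affinity per view, stacked into an n x n x V tensor.
    T = np.stack([normalized_affinity(X, sigma) for X in views], axis=2)
    n = T.shape[0]
    unfolded = T.reshape(n, -1)  # mode-1 unfolding, shape n x (n * V)
    U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    emb = U[:, :k]
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.maximum(norms, 1e-12)  # row-normalize before k-means


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for two data sources describing the same 20 objects.
rng = np.random.default_rng(0)
view_a = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(3.0, 0.1, (10, 2))])
view_b = np.vstack([rng.normal(0.0, 0.1, (10, 3)), rng.normal(3.0, 0.1, (10, 3))])

embedding = multiview_embedding([view_a, view_b], k=2)
labels = kmeans(embedding, k=2)
```

Stacking the views before the decomposition lets every data source contribute to a single embedding, rather than clustering each view separately and reconciling the results afterwards.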
