Abstract

Subspace clustering identifies clusters of objects that are similar in subsets of attributes, where each subset defines a subspace. It faces three major challenges. First, subspace clustering algorithms explore an exponential number of subspaces, which may contain redundant clusters. This challenge is handled by a rough set based approach, the Interesting Subspace Clustering (ISC) algorithm, which improves efficiency by pre-pruning uninteresting subspaces and identifying dense clusters only in interesting subspaces. Second, an enormous number of subspace clusters is generated, which makes their interpretation difficult. This is addressed by a summarization algorithm, Similarity connectedness based Clustering on subspace Clusters (SCoC), which generates a compact set of high-dimensional summarized subspace clusters based on the novel concept of similarity connectedness. Finally, the problem of density divergence, which arises when subspace clusters are formed at different dimensionalities, is handled by the subspace clustering with density variation algorithm, which produces high-quality clusters by using density thresholds appropriate to the spread of the data in the given subspace. These solutions, proposed by the authors, are orthogonal to one another, and this paper therefore proposes to hybridize them. The first hybridization approach, Improved-ISC, achieves better-quality subspace clusters and more efficient exploration of subspaces. The second hybridization approach, the Cascaded-SCoC algorithm, produces a compact set of improved-quality subspace clusters. Both algorithms outperform existing algorithms in terms of the quality and conciseness of the resulting clusters.
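To make the basic notion concrete, the following is a minimal sketch of density-based clustering restricted to a chosen subset of attributes. It is not the ISC, SCoC, or density variation algorithm described above. The function names (`all_subspaces`, `dense_clusters`), the parameters `eps` and `min_pts`, the Chebyshev distance, and the toy data are all assumptions made for illustration only.

```python
from itertools import combinations

def all_subspaces(n_dims):
    """Enumerate every non-empty attribute subset: 2**n_dims - 1 subspaces."""
    return [dims for r in range(1, n_dims + 1)
            for dims in combinations(range(n_dims), r)]

def dense_clusters(points, dims, eps, min_pts):
    """Group points that are density-connected within the subspace `dims`.

    Hypothetical helper: a point is a core point if at least `min_pts`
    points (itself included) lie within Chebyshev distance `eps` of it
    in the projected space.
    """
    proj = [tuple(p[d] for d in dims) for p in points]
    n = len(proj)

    def neighbors(i):
        return [j for j in range(n)
                if max(abs(a - b) for a, b in zip(proj[i], proj[j])) <= eps]

    core = [i for i in range(n) if len(neighbors(i)) >= min_pts]
    core_set = set(core)
    seen, clusters = set(), []
    for i in core:
        if i in seen:
            continue
        stack, members = [i], set()
        while stack:
            k = stack.pop()
            if k in members:
                continue
            members.add(k)
            if k in core_set:  # only core points expand the cluster
                stack.extend(j for j in neighbors(k) if j not in members)
        seen |= members
        clusters.append(sorted(members))
    return clusters

# Toy data (assumed): attribute 0 forms two tight groups, attribute 1 is spread out.
points = [(1.0, 0.0), (1.1, 10.0), (1.2, 20.0),
          (5.0, 5.0), (5.1, 15.0), (5.2, 25.0)]

found = {dims: dense_clusters(points, dims, eps=0.5, min_pts=3)
         for dims in all_subspaces(2)}
```

On this toy data, two clusters appear only in the one-dimensional subspace of attribute 0, while the subspace of attribute 1 and the full two-dimensional space contain none. The enumeration in `all_subspaces` also shows why exhaustive exploration grows exponentially with the number of attributes, and the single fixed `eps` illustrates why one density threshold may not suit subspaces of different dimensionality.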
