Finding Subspace Clusters Using Ranked Neighborhoods

Emin Aksehirli, Matthijs Van Leeuwen, Bart Goethals, Siegfried Nijssen

doi:10.1109/icdmw.2015.202

Abstract

Clustering high-dimensional datasets is challenging due to the curse of dimensionality. One approach to this challenge is to search for subspace clusters, i.e., clusters that are present in subsets of the attributes. Recently, the cartification algorithm was proposed to find such subspace clusters. The distinguishing feature of this algorithm is that it operates on a neighborhood database, in which only the identities of the k closest objects are stored for every object. Cartification was shown to produce better results than other state-of-the-art subspace clustering algorithms. However, the clusters it detects were also found to depend heavily on the parameter settings; in other words, it is not robust to its input parameters. In this paper, we propose a new approach, called ranked cartification, that produces more robust results than ordinary cartification. We develop a transformation that creates ranked matrices instead of neighborhood databases, and we identify clusters in these ranked matrices. We demonstrate that this method is more robust than cartification in terms of cluster detection.
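The difference between a neighborhood database and a ranked matrix can be illustrated with a minimal sketch. The abstract does not specify the exact transformation, so everything below is an assumption made for illustration: the function names `knn_neighborhoods` and `rank_matrix` are hypothetical, a single numeric attribute is used, distance is the absolute difference, each object counts as its own nearest neighbor, and ties are broken by object index.

```python
def sorted_by_distance(values, i):
    """Indices of all objects ordered by distance to object i (assumed: ties by index)."""
    return sorted(range(len(values)), key=lambda j: (abs(values[i] - values[j]), j))


def knn_neighborhoods(values, k):
    """Neighborhood database: for each object, the set of its k closest objects."""
    return [set(sorted_by_distance(values, i)[:k]) for i in range(len(values))]


def rank_matrix(values):
    """Ranked matrix: ranks[i][j] is the rank of object j in the neighbor order of object i."""
    n = len(values)
    ranks = []
    for i in range(n):
        row = [0] * n
        for rank, j in enumerate(sorted_by_distance(values, i)):
            row[j] = rank
        ranks.append(row)
    return ranks


# Toy data for one attribute: two visible groups, {0, 1, 2} and {3, 4}.
values = [10, 12, 11, 50, 53]

neighborhoods = knn_neighborhoods(values, k=3)
ranks = rank_matrix(values)

# Any k-neighborhood can be read off the ranked matrix by thresholding,
# so the ranked matrix does not commit to a single value of k.
k = 3
from_ranks = [{j for j, r in enumerate(row) if r < k} for row in ranks]
```

In this sketch the neighborhood database depends on the chosen k, while the ranked matrix keeps the full ordering and only needs a threshold to recover the neighborhoods for any k. This is one way to read the robustness motivation stated above, not a description of the algorithm in the paper.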

Full Text