Abstract

Multiple clustering explores the same data from different perspectives by discovering several distinct, meaningful clusterings. However, most existing approaches focus overwhelmingly on the diversity between clustering subspaces and pay much less attention to the salience of those subspaces. As a consequence, the quality of the produced clusterings is an understudied aspect of the problem. Furthermore, existing methods cannot explain the unique internal subspace structure of each clustering, and cannot incorporate multi-facet knowledge to generate different clusterings. In this paper, we propose a solution named iMClusts (interpretable Multiple Clusterings by diversified attention). iMClusts uses the expressive representational power of deep autoencoders and multi-head attention to generate multiple salient embedding matrices, and multiple clusterings therein. In addition, it leverages multi-facet knowledge and enhances the diversity between heads to boost the quality and diversity of the multiple clusterings. Experimental results on benchmark datasets show that iMClusts can generate multiple clusterings that combine quality, interpretability, and diversity.
