DLCS: A deep learning-based Clustering solution without any clustering algorithm, Utopia?

Frédéric Ros, Rabia Riad

doi:10.1016/j.knosys.2024.111834

Abstract

Clustering is a process widely studied in the field of pattern recognition. Despite the existence of numerous algorithms and continuous innovation, there are still unresolved issues. Clusters can exhibit diverse characteristics in terms of size, shape, and density, while noise and outliers pose challenges. One major hurdle is discovering a natural partition without relying on parameters or heuristics. In this paper, we propose a deep-learning framework capable of simulating clustering algorithms without the need for parameter tuning or embedded heuristics. Rather than being a novel clustering algorithm, our framework learns to cluster data in a natural and intuitive way without manual intervention. The framework takes a database as input, provides a simple diagnosis of cluster presence, and identifies the discovered clusters without requiring any input from the investigator. Two phases are involved: a mapping phase, which uses a deep-learning model trained by machine-learning experts to convert the original distance data into a more user-friendly representation, and a decoding phase, which generates the final partitions. Our preliminary version is currently limited to small databases and only simulates the behavior of the K-means algorithm. We conducted simulation experiments using both synthetic and real datasets to showcase the effectiveness and applicability of our idea. A demonstration will be available on the author’s website (http://r-riad.net/).
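
To make the two-phase flow concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the "user-friendly representation" is a binary co-membership matrix, which the abstract does not state. The trained deep-learning model is replaced by a hypothetical stand-in, `stand_in_mapping`, that simply thresholds distances. The threshold exists only to make the sketch runnable, whereas the framework described above is meant to need no such parameter. All function names are illustrative.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance matrix: the 'original distance data' fed to the mapping phase."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def stand_in_mapping(dist, threshold):
    """Hypothetical placeholder for the trained deep model (assumption).

    Returns a binary matrix whose entry (i, j) is 1 when points i and j
    are assumed to belong to the same cluster.
    """
    return (dist <= threshold).astype(int)

def decode_partition(co_membership):
    """Decoding phase (assumed form): label the connected components of the matrix."""
    n = co_membership.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(co_membership[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy database: two well-separated groups of three points each.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
dist = pairwise_distances(points)
co_membership = stand_in_mapping(dist, threshold=1.0)
labels = decode_partition(co_membership)
n_clusters = int(labels.max()) + 1
```

In this toy run the decoding step recovers one label per group, and `n_clusters` plays the role of the cluster-presence diagnosis mentioned above. In the framework itself, the mapping would come from the trained model rather than from a hand-set threshold.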
