Clustering is a classical approach to analyzing data structure in machine learning and pattern recognition. Recently, the anchor-based graph has been widely adopted to improve the accuracy of many graph-based clustering techniques. To achieve better clustering performance, we propose a novel clustering approach, the progressive self-supervised clustering method with novel category discovery (PSSCNCD), which consists of three procedures. First, we propose a new semisupervised framework with novel category discovery to guide label propagation, which is reinforced by a parameter-insensitive anchor-based graph obtained from balanced K-means and hierarchical K-means (BKHK). Second, we design a novel representative point selection strategy based on our semisupervised framework to discover each representative point and assign it a pseudolabel progressively, where every pseudolabel hypothetically corresponds to a real category in each self-supervised label propagation. Third, once sufficient representative points have been found, the labels of all samples are predicted to obtain the final clustering results. Experimental results on several toy examples and benchmark data sets demonstrate that our method outperforms other clustering approaches.
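The abstract names BKHK as the source of the anchors but gives no algorithmic detail. As a rough illustration only, the Python sketch below shows one plausible reading of the idea: the data are split recursively into two equal-sized halves by a balanced 2-means step, and the mean of each final group serves as an anchor. The function names `balanced_two_means` and `bkhk_anchors`, the `depth`, `n_iter`, and `seed` parameters, and the random initialization are all assumptions made for this sketch; it is not the authors' implementation, and it does not cover the anchor graph, label propagation, or representative point selection.

```python
import numpy as np


def balanced_two_means(X, n_iter=10, seed=0):
    """Split the rows of X into two groups whose sizes differ by at most one.

    Hypothetical helper for illustration; requires at least two rows.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: initialize the two centers from two distinct random samples.
    centers = X[rng.choice(n, size=2, replace=False)]
    half = n // 2
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to both centers.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Samples with the smallest (d0 - d1) prefer center 0 the most;
        # the first half of that ranking goes to group 0, the rest to group 1.
        order = np.argsort(d[:, 0] - d[:, 1])
        new_labels = np.ones(n, dtype=int)
        new_labels[order[:half]] = 0
        if np.array_equal(new_labels, labels):
            break  # assignment is stable
        labels = new_labels
        centers = np.stack([X[labels == 0].mean(axis=0),
                            X[labels == 1].mean(axis=0)])
    return labels


def bkhk_anchors(X, depth, seed=0):
    """Return 2**depth anchors as the means of balanced hierarchical groups.

    Hypothetical helper for illustration; requires X.shape[0] >= 2**depth.
    """
    groups = [np.arange(X.shape[0])]
    for level in range(depth):
        next_groups = []
        for idx in groups:
            labels = balanced_two_means(X[idx], seed=seed + level)
            next_groups.append(idx[labels == 0])
            next_groups.append(idx[labels == 1])
        groups = next_groups
    return np.stack([X[idx].mean(axis=0) for idx in groups])


# Usage on synthetic data (assumed, not from the paper).
X_demo = np.random.default_rng(0).normal(size=(200, 2))
anchors_demo = bkhk_anchors(X_demo, depth=3)
```

With 200 two-dimensional samples and `depth=3`, `anchors_demo` has shape `(8, 2)`, one row per anchor. Because every split is balanced, each anchor summarizes the same number of samples (25 here), which is the usual motivation for preferring balanced splits over plain K-means when building an anchor set.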