Abstract

In this article, we propose a novel model for constrained clustering, namely, dissimilarity propagation-guided graph-Laplacian principal component analysis (DP-GLPCA). By fully utilizing a limited amount of weakly supervised information in the form of pairwise constraints, the proposed DP-GLPCA is capable of capturing both the local and global structures of input samples and exploiting their characteristics for accurate clustering. More specifically, we first formulate a convex semisupervised low-dimensional embedding model by incorporating a new dissimilarity regularizer into GLPCA (i.e., an unsupervised dimensionality reduction model), in which both the similarity and dissimilarity between low-dimensional representations are enforced with the constraints to improve their discriminability. An efficient iterative algorithm based on the inexact augmented Lagrange multiplier is designed to solve the model with guaranteed global convergence. Furthermore, we propose to propagate the cannot-link constraints (i.e., dissimilarity) to make the dissimilarity regularizer more informative. The resulting dissimilarity propagation (DP) model is solved iteratively, and we prove that it converges to a Karush-Kuhn-Tucker point. Extensive experiments on nine commonly used benchmark data sets show that the proposed DP-GLPCA achieves much higher clustering accuracy than state-of-the-art constrained clustering methods. In addition, the effectiveness and advantage of the proposed DP model are experimentally verified. To the best of our knowledge, this is the first work to investigate DP, in contrast to existing pairwise constraint propagation methods that propagate similarity. The code is publicly available at https://github.com/jyh-learning/DP-GLPCA.
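To give a concrete picture of the kind of objective the abstract describes, the following minimal Python sketch illustrates a Laplacian-regularized PCA embedding augmented with a cannot-link (dissimilarity) penalty. It is only an illustration under simplifying assumptions: the function name `dp_glpca_embed`, the closed-form eigendecomposition solver, and the specific weighting are ours, not the paper's; the actual DP-GLPCA model is convex, is solved by an inexact augmented Lagrange multiplier scheme, and refines the dissimilarity regularizer via propagation, none of which is reproduced here.

```python
import numpy as np

def dp_glpca_embed(X, W, cannot_links, alpha=1.0, beta=1.0, k=10):
    """Hypothetical sketch of a GLPCA-style embedding with a dissimilarity term.

    X            : (d, n) data matrix, columns are samples
    W            : (n, n) symmetric affinity matrix (similarity graph)
    cannot_links : iterable of (i, j) index pairs under cannot-link constraints
    k            : target embedding dimension
    """
    n = X.shape[1]

    # Graph Laplacian of the similarity graph: L = D - W
    L = np.diag(W.sum(axis=1)) - W

    # Laplacian of the cannot-link (dissimilarity) graph
    C = np.zeros((n, n))
    for i, j in cannot_links:
        C[i, j] = C[j, i] = 1.0
    L_c = np.diag(C.sum(axis=1)) - C

    # Combined quadratic form: the PCA reconstruction term enters (up to
    # scaling) as -X^T X; the similarity Laplacian pulls neighboring samples
    # together, while the negatively weighted dissimilarity Laplacian pushes
    # cannot-link pairs apart in the embedding.
    G = -X.T @ X + alpha * L - beta * L_c

    # With an orthonormality constraint on the embedding Q, the minimizer is
    # spanned by the k eigenvectors of G with the smallest eigenvalues.
    _, vecs = np.linalg.eigh(G)
    return vecs[:, :k]  # (n, k) low-dimensional representations
```

The returned low-dimensional representations would then be fed to a standard clustering routine (e.g., k-means), which is the role the embedding plays in the constrained clustering pipeline described above.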
