Multiview clustering has received great attention, and numerous subspace clustering algorithms for multiview data have been presented. However, most of these algorithms do not handle high-dimensional data effectively and fail to exploit the consistency of the number of connected components in the similarity matrices of different views. In this article, we propose a novel consistency-induced multiview subspace clustering (CiMSC) method to tackle these issues. CiMSC is mainly composed of structural consistency (SC) and sample assignment consistency (SAC). Specifically, SC aims to learn a similarity matrix for each single view in which the number of connected components equals the number of clusters in the dataset. SAC aims to minimize the discrepancy in the number of connected components between the similarity matrices of different views, based on the SAC assumption that different views should produce the same number of connected components in their similarity matrices. CiMSC also formulates the cluster indicator matrices for different views and the shared similarity matrices simultaneously in a single optimization framework. Since each column of the similarity matrix can be used as a new representation of a data point, CiMSC can learn an effective subspace representation for high-dimensional data, which is encoded into the latent representation by reconstruction in a nonlinear manner. We employ an alternating optimization scheme to solve the optimization problem. Experiments validate the advantage of CiMSC over 12 state-of-the-art multiview clustering approaches; for example, the accuracy of CiMSC is 98.06% on the BBCSport dataset.
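As an illustration of the property that SC and SAC refer to, the following minimal Python sketch counts the connected components of the graph induced by a similarity matrix and compares the counts across views. It is not the CiMSC optimization itself. The function name `count_components`, the threshold `tol`, and the two toy matrices are assumptions introduced here for illustration only.

```python
from collections import deque


def count_components(S, tol=1e-8):
    """Count connected components of the undirected graph induced by S.

    S is a square similarity matrix given as a list of lists. Entries whose
    magnitude is at most `tol` are treated as "no edge" (assumed threshold).
    """
    n = len(S)
    seen = [False] * n
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        # Every unvisited node starts a new component; explore it by BFS.
        components += 1
        seen[start] = True
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                # Undirected graph: an edge exists if either direction is non-negligible.
                if not seen[j] and max(abs(S[i][j]), abs(S[j][i])) > tol:
                    seen[j] = True
                    queue.append(j)
    return components


# Two hypothetical views of four samples with intended clusters {0, 1} and {2, 3}.
view_a = [[1.0, 0.9, 0.0, 0.0],
          [0.9, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.8],
          [0.0, 0.0, 0.8, 1.0]]
view_b = [[1.0, 0.7, 0.0, 0.0],
          [0.7, 1.0, 0.2, 0.0],
          [0.0, 0.2, 1.0, 0.6],
          [0.0, 0.0, 0.6, 1.0]]

k = 2  # assumed number of clusters
counts = [count_components(v) for v in (view_a, view_b)]

# SC-style check: does each view have exactly k components?
sc_satisfied = [c == k for c in counts]
# SAC-style check: do all views agree on the number of components?
sac_satisfied = len(set(counts)) == 1
```

In this toy example the first view splits into two components, while the weak link between samples 1 and 2 in the second view merges everything into a single component, so the two views disagree. This is the kind of discrepancy that the abstract says SAC aims to minimize.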