Discovering novel visual categories, particularly in recognition tasks, is a prominent area of research in artificial intelligence. Visual data such as images and videos is inherently high-dimensional but often exhibits low-dimensional subspace structure, and uncovering novel subspace structures of this kind remains a substantial challenge. To address this challenge, we develop a novel subspace discovery method consisting of four steps. First, we introduce a base-expressive network (BENet) designed to learn coefficients that express the relationships between samples and bases; these coefficients are normalized into a probability distribution with the softmax function. Second, to enhance this process, we propose a Contrastive Subspace Distribution Learning (CSDL) technique based on the Rényi divergence. CSDL combines self-supervised and supervised contrastive distances between the normalized coefficients, computed over all samples and over labeled samples, respectively. Third, the base-expressive coefficients are computed by the learned BENet, which effectively transfers subspace information from labeled samples to unlabeled ones. Finally, we apply spectral clustering to the predicted coefficients to automatically assign categories to unlabeled samples. Extensive experiments show that our approach outperforms related methods in category discovery, unsupervised clustering, and semi-supervised (subspace) clustering.
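
To make the first two steps concrete, the following is a minimal sketch, not the paper's implementation, of how raw coefficients might be softmax-normalized and then compared with a Rényi divergence of order α, D_α(P‖Q) = (1/(α−1)) log Σ_i p_i^α q_i^(1−α). The names `softmax` and `renyi_divergence`, the choice α = 0.5, the clipping constant `eps`, and the random stand-in coefficients are all assumptions made for illustration; the abstract does not specify them, nor how CSDL assembles these distances into a contrastive loss.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)


def renyi_divergence(p, q, alpha=0.5, eps=1e-12):
    """Renyi divergence D_alpha(p || q) for alpha > 0, alpha != 1.

    p and q are probability vectors along the last axis.
    alpha=0.5 and eps are illustrative choices, not values from the paper.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    p = np.clip(p, eps, 1.0)  # avoid zero raised to a negative power
    q = np.clip(q, eps, 1.0)
    s = np.sum(p ** alpha * q ** (1.0 - alpha), axis=-1)
    return np.log(s) / (alpha - 1.0)


# Hypothetical raw coefficients: 3 samples expressed over 5 bases.
rng = np.random.default_rng(0)
raw = rng.normal(size=(3, 5))

# Each row becomes a probability distribution over bases.
coeffs = softmax(raw)

# Divergence of a sample with itself versus with a different sample.
d_self = renyi_divergence(coeffs[0], coeffs[0])
d_other = renyi_divergence(coeffs[0], coeffs[1])
```

In this sketch the divergence of a distribution with itself is zero up to floating-point error, while the divergence between two different samples is positive, which is the property a contrastive objective over such distances would rely on.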