Contextuality Helps Representation Learning for Generalized Category Discovery
This paper introduces a novel approach to Generalized Category Discovery (GCD) by leveraging the concept of contextuality to enhance the identification and classification of categories in unlabeled datasets. Drawing inspiration from human cognition's ability to recognize objects within their context, we propose a dual-context based method. Our model integrates two levels of contextuality: instance-level, where nearest-neighbor contexts are utilized for contrastive learning, and cluster-level, employing prototypical contrastive learning based on category prototypes. The integration of the contextual information effectively improves the feature learning and thereby the classification accuracy of all categories, which better deals with the real-world datasets. Different from the traditional semi-supervised and novel category discovery techniques, our model focuses on a more realistic and challenging scenario where both known and novel categories are present in the unlabeled data. Extensive experimental results on several benchmark data sets demonstrate that the proposed model outperforms the state-of-the-art. Code is available at: https://github.com/Clarence-CV/ Contexuality-GCD
- Conference Article
53
- 10.1109/iccv48922.2021.00065
- Oct 1, 2021
This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories. We present a generic, end-to-end framework to jointly learn a reliable representation and assign clusters to unlabelled data. To avoid over-fitting the learnt embedding to labelled data, we take inspiration from self-supervised representation learning by noise-contrastive estimation and extend it to jointly handle labelled and unlabelled data. In particular, we propose using category discrimination on labelled data and cross-modal discrimination on multi-modal data to augment instance discrimination used in conventional contrastive learning approaches. We further employ Winner-Take-All (WTA) hashing algorithm on the shared representation space to generate pairwise pseudo labels for unlabelled data to better predict cluster assignments. We thoroughly evaluate our framework on large-scale multi-modal video benchmarks Kinetics-400 and VGG-Sound, and image benchmarks CIFAR10, CIFAR100 and ImageNet, obtaining state-of-the-art results.
- Research Article
- 10.1609/aaai.v39i20.35414
- Apr 11, 2025
- Proceedings of the AAAI Conference on Artificial Intelligence
This paper addresses generalized category discovery (GCD), the task of clustering unlabeled data from potentially known or unknown categories with the help of labeled instances from each known category. Compared to traditional semi-supervised learning, GCD is more challenging because unlabeled data could be from novel categories not appearing in labeled data. Current state-of-the-art methods typically learn a parametric classifier assisted by self-distillation. While being effective, these methods do not make use of cross-instance similarity to discover class-specific semantics which are essential for representation learning and category discovery. In this paper, we revisit the association-based paradigm and propose a Prior-constrained Association Learning method to capture and learn the semantic relations within data. In particular, the labeled data from known categories provides a unique prior for the association of unlabeled data. Unlike previous methods that only adopts the prior as a pre or post-clustering refinement, we fully incorporate the prior into the association process, and let it constrain the association towards a reliable grouping outcome. The estimated semantic groups are utilized through non-parametric prototypical contrast to enhance the representation learning. A further combination of both parametric and non-parametric classification complements each other and leads to a model that outperforms existing methods by a significant margin. On multiple GCD benchmarks, we perform extensive experiments and validate the effectiveness of our proposed method.
- Research Article
6
- 10.1016/j.inffus.2024.102547
- Jun 25, 2024
- Information Fusion
Prediction consistency regularization for Generalized Category Discovery
- Conference Article
184
- 10.1109/cvpr52688.2022.00734
- Jun 1, 2022
In this paper, we consider a highly general image recognition setting wherein, given a labelled and unlabelled set of images, the task is to categorize all images in the unlabelled set. Here, the unlabelled images may come from labelled classes or from novel ones. Existing recognition methods are not able to deal with this setting, because they make several restrictive assumptions, such as the unlabelled instances only coming from known – or unknown – classes, and the number of unknown classes being known a-priori. We address the more unconstrained setting, naming it ‘Generalized Category Discovery’, and challenge all these assumptions. We first establish strong baselines by taking state-of-the-art algorithms from novel category discovery and adapting them for this task. Next, we propose the use of vision transformers with contrastive representation learning for this open-world setting. We then introduce a simple yet effective semi-supervised k-means method to cluster the unlabelled data into seen and unseen classes automatically, substantially outperforming the baselines. Finally, we also propose a new approach to estimate the number of classes in the unlabelled data. We thoroughly evaluate our approach on public datasets for generic object classification and on fine-grained datasets, leveraging the recent Semantic Shift Benchmark suite. Code: https://www.robots.ox.ac.uk/~vgg/research/gcd
- Book Chapter
- 10.1007/978-3-031-44198-1_21
- Jan 1, 2023
GII: A Unified Approach to Representation Learning in Open Set Recognition with Novel Category Discovery
- Research Article
- 10.3390/sym16091098
- Aug 23, 2024
- Symmetry
Oraclebone characters (OBCs) are crucial for understanding ancient Chinese history, but existing recognition methods only recognize known categories in labeled data, neglecting novel categories in unlabeled data. This work introduces a novel approach to discovering new OBC categories in unlabeled data through generalized category discovery. We address the challenges posed by OBCs’ instinctive characteristics, such as misleading contrastive views from random cropping, sub-optimal learned representation, and insufficient supervision for unlabeled data. Our method features a symmetrical structure enhanced by character component distillation and self-merged pseudo-label. We utilize random geometric transforms to create symmetrical contrastive views to avoid misleading views. Then, the proposed character component distillation procedure optimizes symmetrical shared character components for better transferable representation. Finally, we construct a self-merged pseudo-label from the model and a symmetrical teacher model to provide stable and robust supervision for unlabeled data. Extensive experiments validate the superiority of our method in recognizing ’All’ and ’Novel’ OBC categories, providing an effective tool to aid OBC researchers.
- Conference Article
- 10.24963/ijcai.2024/532
- Aug 1, 2024
Generalized Category Discovery (GCD) presents a realistic and challenging problem in open-world learning. Given a par- tially labeled dataset, GCD aims to categorize unlabeled data by leveraging visual knowledge from the labeled data, where the unlabeled data includes both known and unknown classes. Existing methods based on parametric/non-parametric classi- fiers attempt to generate pseudo-labels/relationships for the unlabeled data to enhance representation learning. However, the lack of ground-truth labels for novel classes often leads to noisy pseudo-labels/relationships, resulting in suboptimal representation learning. This paper introduces a novel method using Nearest Neighbor Distance-aware Label Consistency sample selection. It creates class-consistent subsets for novel class sample clusters from the current GCD method, acting as “pseudo-labeled sets” to mitigate representation bias. We propose progressive supervised representation learning with selected samples to optimize the trade-off between quantity and purity in each subset. Our method is versatile and appli- cable to various GCD methods, whether parametric or non- parametric. We conducted extensive experiments on multiple generic and fine-grained image classification datasets to eval- uate the effectiveness of our approach. The results demon- strate the superiority of our method in achieving improved performance in generalized category discovery tasks.
- Book Chapter
- 10.3233/faia251176
- Oct 21, 2025
- Frontiers in artificial intelligence and applications
Generalized Category Discovery (GCD) faces the challenge of categorizing unlabeled data containing both known and novel classes, given only labels for known classes. Previous studies often treat each class independently, neglecting the inherent inter-class relations. Obtaining such inter-class relations directly presents a significant challenge in real-world scenarios. To address this issue, we propose ReLKD, an end-to-end framework that effectively exploits implicit inter-class relations and leverages this knowledge to enhance the classification of novel classes. ReLKD comprises three key modules: a target-grained module for learning discriminative representations, a coarse-grained module for capturing hierarchical class relations, and a distillation module for transferring knowledge from the coarse-grained module to refine the target-grained module’s representation learning. Extensive experiments on four datasets demonstrate the effectiveness of ReLKD, particularly in scenarios with limited labeled data. The code for ReLKD is available at https://github.com/ZhouF-ECNU/ReLKD.
- Research Article
3
- 10.1016/j.knosys.2024.112214
- Jul 14, 2024
- Knowledge-Based Systems
Efficient Representation Learning for Generalized Category Discovery
- Research Article
1
- 10.1109/tnnls.2025.3597074
- Dec 1, 2025
- IEEE transactions on neural networks and learning systems
This study addresses the problem of generalized category discovery (GCD), an advanced and challenging semi-supervised learning scenario that deals with unlabeled data from both known and novel categories. Although recent research has effectively engaged with this issue, these studies typically map features into Euclidean space, which fails to maintain the latent semantic hierarchy of the training samples effectively. This limitation restricts the exploration of more detailed and rich information and degrades the performance in discovering new categories. The emerging field of hyperbolic representation learning suggests that hyperbolic geometry could be advantageous for extracting semantic information to tackle this problem. Motivated by this, we proposed hyperbolic hierarchical representation learning for GCD (HypGCD). Specifically, HypGCD enhances representations in hyperbolic space, building upon the Euclidean space representation from two perspectives: instance-class level and instance-instance level. At the instance-class level, HypGCD endeavors to construct well-defined clusters, with each sample forming a robust hierarchical cluster structure. Concurrently, at the instance-instance level, HypGCD anticipates that a subset of samples will display a tree-like structure in local space, which aligns more closely with real-world scenarios. Finally, HypGCD optimizes the Euclidean and hyperbolic space collectively to obtain refined features. Additionally, we show that HypGCD is exceptionally effective, achieving state-of-the-art (SOTA) results on several datasets. The code is available at https://github.com/DuannYu/HypGCD.
- Research Article
1
- 10.1016/j.image.2024.117195
- Aug 17, 2024
- Signal Processing: Image Communication
Feature extractor optimization for discriminative representations in Generalized Category Discovery
- Research Article
6
- 10.1609/aaai.v38i10.28959
- Mar 24, 2024
- Proceedings of the AAAI Conference on Artificial Intelligence
Generalized Category Discovery (GCD) is a crucial real-world task that aims to recognize both known and novel categories from an unlabeled dataset by leveraging another labeled dataset with only known categories. Despite the improved performance on known categories, current methods perform poorly on novel categories. We attribute the poor performance to two reasons: biased knowledge transfer between labeled and unlabeled data and noisy representation learning on the unlabeled data. The former leads to unreliable estimation of learning targets for novel categories and the latter hinders models from learning discriminative features. To mitigate these two issues, we propose a Transfer and Alignment Network (TAN), which incorporates two knowledge transfer mechanisms to calibrate the biased knowledge and two feature alignment mechanisms to learn discriminative features. Specifically, we model different categories with prototypes and transfer the prototypes in labeled data to correct model bias towards known categories. On the one hand, we pull instances with known categories in unlabeled data closer to these prototypes to form more compact clusters and avoid boundary overlap between known and novel categories. On the other hand, we use these prototypes to calibrate noisy prototypes estimated from unlabeled data based on category similarities, which allows for more accurate estimation of prototypes for novel categories that can be used as reliable learning targets later. After knowledge transfer, we further propose two feature alignment mechanisms to acquire both instance- and category-level knowledge from unlabeled data by aligning instance features with both augmented features and the calibrated prototypes, which can boost model performance on both known and novel categories with less noise. Experiments on three benchmark datasets show that our model outperforms SOTA methods, especially on novel categories. Theoretical analysis is provided for an in-depth understanding of our model in general. Our code and data are available at https://github.com/Lackel/TAN.