Abstract

In this paper, we tackle the problem of discovering new classes in unlabeled visual data given labeled data from disjoint classes. Existing methods typically first pre-train a model with labeled data, and then identify new classes in unlabeled data via unsupervised clustering. However, the labeled data, which provide essential knowledge, are often underexplored in the second step. The challenge is that the labeled and unlabeled examples come from non-overlapping classes, which makes it difficult to build a learning relationship between them. In this work, we introduce OpenMix to mix the unlabeled examples from an open set and the labeled examples from known classes, where their non-overlapping labels and pseudo-labels are simultaneously mixed into a joint label distribution. OpenMix dynamically compounds examples in two ways. First, we produce mixed training images by combining labeled examples with unlabeled examples. With the benefit of the unique prior knowledge in novel class discovery, the generated pseudo-labels are more credible than the original unlabeled predictions. As a result, OpenMix helps prevent the model from overfitting on unlabeled samples that may be assigned wrong pseudo-labels. Second, the first form of mixing encourages the unlabeled examples with high class probabilities to have considerable accuracy. We treat these examples as reliable anchors and further integrate them with unlabeled samples. This enables us to generate more combinations of unlabeled examples and exploit finer object relations among the new classes. Experiments on three classification datasets demonstrate the effectiveness of the proposed OpenMix, which is superior to state-of-the-art methods in novel class discovery.
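To make the joint label distribution concrete, the following is a minimal NumPy sketch of mixing a labeled batch with an unlabeled batch. It is an illustration under stated assumptions, not the authors' implementation: the function name `mix_labeled_unlabeled`, the parameter `alpha`, and the MixUp-style Beta-distributed mixing coefficient are all assumptions, since the abstract does not specify how the mixing weight is drawn.

```python
import numpy as np

def mix_labeled_unlabeled(x_l, y_l, x_u, p_u, num_old, alpha=1.0, rng=None):
    """Mix labeled and unlabeled batches into one joint label space.

    x_l: (B, ...) labeled images; y_l: (B,) integer labels in [0, num_old)
    x_u: (B, ...) unlabeled images; p_u: (B, num_new) pseudo-label probabilities
    Returns mixed images, joint labels of shape (B, num_old + num_new), and lam.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: a single MixUp-style coefficient drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)

    # Convex combination of the two image batches.
    x_mix = lam * x_l + (1.0 - lam) * x_u

    # One-hot encode the ground-truth labels of the old classes.
    y_old = np.zeros((len(y_l), num_old))
    y_old[np.arange(len(y_l)), y_l] = 1.0

    # Joint label: the first num_old entries carry the labeled part,
    # the remaining entries carry the pseudo-labels of the new classes.
    y_mix = np.concatenate([lam * y_old, (1.0 - lam) * p_u], axis=1)
    return x_mix, y_mix, lam
```

Because the old and new classes do not overlap, the old-class block of each mixed label sums to `lam` and the new-class block sums to `1 - lam`, so every row remains a valid probability distribution.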

Highlights

  • In this work, we address novel class discovery [6,7,9,10], a new problem in which we are given labeled data of known classes and unlabeled data of novel classes.

  • (1) OpenMix can prevent the model from fitting wrong pseudo-labels, thereby consistently improving model performance; (2) OpenMix enables us to explore reliable anchors from unlabeled samples, which can be used to generate diverse smooth samples of new classes towards a more discriminative model; (3) this paper presents a simple baseline for novel class discovery that achieves competitive results; (4) experiments conducted on three datasets show that our approach outperforms the state-of-the-art methods by a large margin in novel class discovery.

  • Given the pre-trained model, we add a new classifier head to the convolutional neural network (CNN) and train the clustering model.
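The reliable anchors mentioned in the highlights can be sketched as follows. This is a hedged illustration only: the function names, the confidence `threshold` of 0.9, the Beta-distributed mixing weight, and the choice to mix soft predictions rather than hard labels are all assumptions, as the text does not state how anchors are selected or combined.

```python
import numpy as np

def select_anchors(p_u, threshold=0.9):
    """Indices of unlabeled samples whose top class probability exceeds threshold."""
    return np.flatnonzero(p_u.max(axis=1) > threshold)

def mix_with_anchors(x_u, p_u, threshold=0.9, alpha=1.0, rng=None):
    """Mix every unlabeled sample with a randomly chosen reliable anchor.

    x_u: (B, ...) unlabeled images; p_u: (B, num_new) predicted probabilities.
    Returns mixed images and mixed pseudo-labels over the new classes.
    """
    rng = np.random.default_rng() if rng is None else rng
    anchor_idx = select_anchors(p_u, threshold)
    if anchor_idx.size == 0:
        # No confident prediction in this batch: leave it unchanged.
        return x_u, p_u

    # Pair each unlabeled sample with one anchor drawn with replacement.
    partner = rng.choice(anchor_idx, size=len(x_u))
    # Assumption: MixUp-style coefficient drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)

    x_mix = lam * x_u + (1.0 - lam) * x_u[partner]
    p_mix = lam * p_u + (1.0 - lam) * p_u[partner]
    return x_mix, p_mix
```

Both inputs to the label mix are distributions over the same new classes, so the mixed pseudo-labels still sum to one per sample.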

Summary

Introduction

We address novel class discovery [6,7,9,10], a new problem in which we are given labeled data of known (old) classes and unlabeled data of novel (new) classes. It is an open-set problem: the classes of the unlabeled data are not defined in advance, and annotated samples of these novel classes are not available. The goal of novel class discovery is to identify new classes in unlabeled data with the support of knowledge of old classes. To achieve this objective, existing methods [6,7,9,10] commonly follow a two-step learning strategy: 1) pre-train the model with labeled data to obtain basic discriminative ability; 2) identify new classes in the unlabeled data via unsupervised clustering. Using the labeled data is much harder than in semi-supervised learning [2,14], because the labeled and unlabeled samples come from disjoint classes.
