Bridging the gap with grad: Integrating active learning into semi-supervised domain generalization

Jingwei Li, Yuan Li, Jie Tan, Chengbao Liu

doi:10.1016/j.neunet.2023.12.017

Abstract

Domain generalization (DG) aims to learn models that generalize to unseen target domains from a large amount of fully annotated source data. However, it is laborious to collect labels for all source data in practice. Some studies draw inspiration from semi-supervised learning (SSL) and define a new task, semi-supervised domain generalization (SSDG), in which unlabeled source data are trained jointly with labeled data to significantly improve performance. Nevertheless, different studies adopt different settings, leading to unfair comparisons. Moreover, the source samples that receive initial annotations are chosen at random, which makes training unstable and unreliable. To this end, we first specify the training paradigm and then leverage active learning (AL) to address these issues. We further develop a new task called Active Semi-supervised Domain Generalization (ASSDG), which consists of two parts: SSDG and AL. We examine the commonalities of SSL and AL and propose a unified framework called Gradient-Similarity-based Sample Filtering and Sorting (GSSFS) to iteratively train the SSDG and AL parts. Gradient similarity is used to select reliable unlabeled source samples for the SSDG part and informative ones for the AL part. Our methods are simple yet efficient, and extensive experiments demonstrate that they achieve the best results on the DG datasets in the low-data regime without bells and whistles.
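To make the idea of gradient-similarity-based filtering and sorting more concrete, the following is a minimal illustrative sketch and not the authors' GSSFS implementation. It assumes a linear softmax classifier, pseudo-labels taken as the argmax prediction, and cosine similarity between each unlabeled sample's gradient and the mean gradient of the labeled set. The rule that high similarity marks a sample as reliable (kept with its pseudo-label) and low similarity marks it as informative (sent for annotation) is likewise an assumption made for illustration. All function names, the threshold, and the budget are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # W is a list of per-class weight rows; x is a feature list.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def ce_grad(W, x, y):
    # Flattened cross-entropy gradient w.r.t. W for a linear softmax
    # classifier: (p - onehot(y)) x^T.
    p = softmax(logits(W, x))
    return [(p[c] - (1.0 if c == y else 0.0)) * xi
            for c in range(len(W)) for xi in x]

def cosine(a, b):
    dot = sum(u * v for u, v in zip(a, b))
    na = math.sqrt(sum(u * u for u in a))
    nb = math.sqrt(sum(v * v for v in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def score_unlabeled(W, labeled, unlabeled):
    # Reference direction: mean gradient over the labeled samples.
    grads = [ce_grad(W, x, y) for x, y in labeled]
    ref = [sum(col) / len(grads) for col in zip(*grads)]
    scored = []
    for idx, x in enumerate(unlabeled):
        p = softmax(logits(W, x))
        pseudo = max(range(len(p)), key=p.__getitem__)  # argmax pseudo-label
        scored.append((idx, pseudo, cosine(ce_grad(W, x, pseudo), ref)))
    return scored

def filter_and_sort(scored, threshold, budget):
    # Assumed rule: filter high-similarity samples as "reliable" (SSL side),
    # sort the rest ascending and query the lowest-similarity ones (AL side).
    reliable = [(i, y) for i, y, s in scored if s >= threshold]
    rest = sorted((t for t in scored if t[2] < threshold), key=lambda t: t[2])
    queries = [i for i, _, _ in rest[:budget]]
    return reliable, queries

# Usage with hand-made (index, pseudo_label, similarity) triples.
scored = [(0, 1, 0.9), (1, 0, -0.5), (2, 1, 0.2), (3, 0, 0.7)]
reliable, queries = filter_and_sort(scored, threshold=0.5, budget=1)
print(reliable, queries)  # [(0, 1), (3, 0)] [1]
```

In an iterative loop under these assumptions, the reliable samples would join the semi-supervised training set with their pseudo-labels, the queried samples would be annotated and moved to the labeled set, and the scores would be recomputed after each training round.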
