Active learning (AL) aims to design label-efficient algorithms by labeling only the most representative samples. It reduces annotation cost and has attracted increasing attention from the community. However, previous AL methods suffer from inadequate annotations and unreliable uncertainty estimation. Moreover, we find that they ignore the intra-diversity of selected samples, which leads to sampling redundancy. To address these challenges, we propose an inductive state-relabeling adversarial AL model (ISRA) that consists of a unified representation generator, an inductive state-relabeling discriminator, and a heuristic clique rescaling module. The generator introduces contrastive learning to leverage unlabeled samples for self-supervised training, using mutual information to improve the representation quality for AL selection. We then design an inductive uncertainty indicator that learns the state score from labeled data and relabels unlabeled data with different importance for better discrimination of instructive samples. To reduce sampling redundancy, the heuristic clique rescaling module measures the intra-diversity of candidate samples and recurrently rescales them to select the most informative samples. Experiments on eight datasets and two imbalanced scenarios show that our model outperforms previous state-of-the-art AL methods. As an extension to the cross-modal AL task, we apply ISRA to image captioning, where it also achieves superior performance.
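To make the sampling-redundancy issue concrete, the following is a minimal sketch of diversity-aware selection. It is not the ISRA clique rescaling module, whose details are not given here; it swaps in a simple greedy rule in which each pick down-weights the scores of remaining candidates that resemble it. The names `cosine`, `select_diverse`, and `penalty`, the use of cosine similarity, and the toy scores and features are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def select_diverse(scores, features, k, penalty=0.9):
    """Greedily pick k indices, rescaling candidates similar to earlier picks.

    Hypothetical illustration only: `scores` stands in for per-sample
    informativeness, `features` for per-sample representations.
    """
    current = list(scores)              # working copy that gets rescaled
    remaining = set(range(len(scores)))
    chosen = []
    while remaining and len(chosen) < k:
        # Highest current score wins; ties go to the lowest index.
        best = max(sorted(remaining), key=lambda i: current[i])
        chosen.append(best)
        remaining.remove(best)
        # Shrink the score of every candidate that resembles the new pick.
        for i in remaining:
            sim = max(0.0, cosine(features[i], features[best]))
            current[i] *= 1.0 - penalty * sim
    return chosen


# Toy data: samples 0 and 1 are near-duplicates, sample 2 is different.
toy_scores = [0.9, 0.85, 0.6]
toy_features = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
print(select_diverse(toy_scores, toy_features, k=2))               # [0, 2]
print(select_diverse(toy_scores, toy_features, k=2, penalty=0.0))  # [0, 1]
```

With `penalty=0.0` the procedure reduces to plain top-k selection and returns the two near-duplicates; with rescaling enabled, the second pick moves to the dissimilar sample even though its raw score is lower.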