Semi-Supervised Object Detection (SSOD) has emerged as a powerful framework that leverages unlabeled data to reduce annotation costs while improving model performance. However, existing SSOD methods concentrate primarily on localizing horizontal objects and overlook the arbitrary-oriented objects that are common in aerial scenes. Extending semi-supervised learning to oriented object detection faces two major challenges: (1) Classification and localization are inconsistent in oriented object detectors, so pseudo-labels selected solely by classification score do not reliably reflect localization quality. (2) A static, monotonic pseudo-label selection process cannot dynamically filter pseudo-labels for different tasks and categories throughout the training iterations, which inevitably introduces accumulated noise and bias into the supervisory signals for the student model. To mitigate these problems, we propose a novel consistency-based semi-supervised oriented object detection framework, named CSLO-Det. Specifically, Task Alignment Learning (TAL) is proposed to improve consistency in terms of both features and learning objectives, thereby facilitating more robust pseudo-label selection. Noise-Resistant Pseudo-label Mining (NR-PM) is introduced to separately and dynamically exploit positives for the classification and localization tasks. Moreover, in contrast to conventional semi-supervised learning frameworks that rely on pseudo-boxes, our method applies pixel-level dense supervision during unsupervised training, which improves detection performance. Comprehensive experimental results show that the proposed CSLO-Det attains state-of-the-art performance on multiple datasets under semi-supervised settings.