The unique imaging modality of synthetic aperture radar (SAR) poses significant challenges for object detection, because SAR images are harder to acquire and interpret than optical images. Recently, numerous studies have proposed cross-domain adaptation methods based on convolutional neural networks (CNNs) to improve SAR object detection using optical data. However, existing cross-domain methods focus on image features, do little to improve the input data, and ignore the valuable supervision provided by a few labeled SAR images. We therefore propose a semi-supervised cross-domain object detection framework that uses optical data and a small amount of SAR data to transfer knowledge for SAR object detection. Our method focuses on data processing to gradually reduce the domain shift at the image, instance, and feature levels. First, we propose a data augmentation method based on image mixing and instance swapping that generates a mixed domain more similar to the SAR domain. This method makes full use of the limited SAR annotations to reduce domain shift at the image and instance levels. Second, at the feature level, we propose an adaptive optimization strategy that filters out mixed-domain samples that deviate significantly from the SAR feature distribution when training the feature extractor. In addition, we employ a Vision Transformer (ViT) as the feature extractor to handle global feature extraction from mixed images, and we propose a detection head based on the normalized Wasserstein distance (NWD) to improve detection of objects with smaller effective regions in SAR images. The effectiveness of the proposed method is evaluated on public SAR ship and oil tank datasets.
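To make the NWD term concrete, the following is a minimal sketch of the normalized Wasserstein distance between two axis-aligned boxes, using the commonly cited formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian. The abstract does not say how the proposed detection head uses NWD, so the function name `nwd`, the box format, and the normalizing constant `c` are illustrative assumptions, not the authors' implementation.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes.

    Boxes are (cx, cy, w, h). Each box is treated as a 2D Gaussian with
    mean (cx, cy) and covariance diag(w^2 / 4, h^2 / 4).
    `c` is a dataset-dependent normalizing constant; the default here is
    an assumed placeholder, not a value taken from the abstract.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance to a similarity in (0, 1].
    return math.exp(-math.sqrt(w2_sq) / c)


# Two small boxes that do not overlap: IoU would be 0, NWD stays positive.
similarity = nwd((10.0, 10.0, 4.0, 4.0), (16.0, 10.0, 4.0, 4.0))
```

Unlike IoU, this similarity stays non-zero and varies smoothly when small boxes do not overlap, which is the usual motivation for applying it to objects that occupy few pixels.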