Most existing change detection (CD) methods target homogeneous images. However, in real-world scenarios such as disaster management, where CD is urgent and the pre-change and post-change images are typically of different modalities, multimodal change detection (MCD) faces significant challenges. First, bi-temporal image pairs acquired by distinct sensors can exhibit an image domain gap. Second, pixel-level annotation of multimodal bi-temporal image pairs requires collaboration among domain experts specializing in different imaging fields, so annotated samples are scarce. To address these challenges, this paper proposes a novel self-supervised difference contrast learning framework (Self-DCF). The framework enables network training without labelled samples by automatically exploiting the feature information inherent in bi-temporal imagery, so that the two images supervise each other. Additionally, a Unified Mapping Unit reduces the domain gap between images of different modalities. The efficiency and robustness of Self-DCF are validated on five popular datasets, where it outperforms state-of-the-art algorithms.