Remote sensing image change detection is a pixel-level dense prediction task that demands both high speed and high accuracy. Detection errors, particularly missed detections, degrade accuracy, while model redundancy harms both accuracy and speed; both issues merit further research. To ensure efficient change detection in terms of both speed and accuracy, a VMamba-based Multi-scale Feature Guiding Fusion Network (VMMCD) is proposed, which rapidly models global relationships and enables multi-scale feature interaction. Specifically, a Mamba backbone replaces the commonly used CNN and Transformer backbones; by leveraging VMamba's global modeling ability with linear computational complexity, the computational resources needed to extract global features are reduced. Second, a compact and efficient lightweight network architecture is designed around the characteristics of the VMamba model, reducing redundancy and avoiding the extraction or introduction of interfering and redundant information, thereby improving both speed and accuracy. Finally, a Multi-scale Feature Guiding Fusion (MFGF) module is developed, which strengthens the global modeling ability of VMamba and enriches the interaction among multi-scale features, addressing the common problem of missed detections in changed areas. The proposed network achieves competitive results on three publicly available datasets (SYSU-CD, WHU-CD, and S2Looking) and surpasses current state-of-the-art (SOTA) methods on SYSU-CD, with an F1 score of 83.35% and an IoU of 71.45%. Moreover, for 256×256 inputs, it runs more than three times faster than the current SOTA VMamba-based change detection model, demonstrating the effectiveness of the proposed approach.
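As a rough illustration of the multi-scale guiding-fusion idea mentioned above (not the paper's actual MFGF implementation, which is only named in the abstract), one common pattern is to upsample a coarse-scale feature map and use it as a multiplicative gate over a fine-scale map. The function name, shapes, and gating form below are all assumptions for the sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def guided_fusion(fine, coarse):
    """Hypothetical sketch: fuse a fine-scale feature map of shape (C, H, W)
    with a coarse-scale map of shape (C, H/2, W/2). The coarse map is
    upsampled and used as a guidance gate; a residual keeps fine detail."""
    # Nearest-neighbor 2x upsampling of the coarse features.
    up = coarse.repeat(2, axis=1).repeat(2, axis=2)
    # Coarse semantics gate the fine features; residual preserves detail.
    return fine * sigmoid(up) + fine

fine = np.ones((8, 4, 4), dtype=np.float32)    # fine-scale features
coarse = np.zeros((8, 2, 2), dtype=np.float32)  # coarse-scale features
fused = guided_fusion(fine, coarse)
print(fused.shape)  # (8, 4, 4)
```

In a real network the gate would be produced by learned convolutions rather than a raw sigmoid, and the fusion would be repeated across several scale pairs of the decoder.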