Abstract

Acquiring disparity maps by dense stereo matching is one of the most important methods for producing digital surface models. However, the characteristics of optical satellite imagery, including significant occlusions and long baselines, make dense matching more difficult. In this study, we propose an end-to-end edge-guided multi-scale matching network (EGMS-Net) tailored to optical satellite stereo image pairs. Using small convolutional filters and residual blocks, EGMS-Net captures rich high-frequency signals during the initial feature extraction phase. Pyramid features are then derived through efficient down-sampling and consolidated into cost volumes. To regularize these cost volumes, we design a top-down multi-scale fusion network that integrates an attention mechanism. Finally, we introduce trainable guided filter layers in the disparity refinement stage to improve the recovery of edge detail. The network is trained and evaluated on the Urban Semantic 3D and WHU-Stereo datasets, and the resulting disparity maps are analyzed. EGMS-Net achieves endpoint errors of 1.515 and 2.459 pixels on the two datasets, respectively. In challenging scenarios, particularly regions with textureless surfaces and dense buildings, the network consistently delivers satisfactory matching performance. EGMS-Net also reduces training time and improves network efficiency.
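The endpoint error reported above is commonly computed as the mean absolute difference between predicted and ground-truth disparities over pixels that have valid ground truth. The following is a minimal sketch under that assumption; the function name `endpoint_error`, the optional `invalid` sentinel, and the toy values are hypothetical, and the abstract does not state how the authors mask invalid pixels.

```python
import math


def endpoint_error(pred, truth, invalid=None):
    """Mean absolute disparity difference over valid ground-truth pixels.

    pred, truth: 2-D lists of floats with the same shape (disparities in pixels).
    invalid: optional sentinel marking missing ground truth (an assumption);
             NaN ground-truth values are always skipped.
    """
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            # Skip pixels without usable ground truth.
            if math.isnan(t) or (invalid is not None and t == invalid):
                continue
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return total / count


# Toy 2x2 example; one ground-truth pixel is missing (NaN) and is ignored.
pred = [[1.0, 2.5], [4.0, 0.0]]
truth = [[1.5, 2.0], [float("nan"), 1.0]]
epe = endpoint_error(pred, truth)
```

In this toy case three pixels contribute absolute errors of 0.5, 0.5, and 1.0, so the mean is 2/3 of a pixel. In practice the same computation would typically run on arrays with a validity mask rather than nested lists.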
