Abstract

Deep learning methods for stereo matching generalize poorly across domains, particularly on unseen scenes or on scenes for which ground-truth disparity maps are unavailable. To address this, we propose SANet, a self-supervised network built on AANet_Edge, a lightweight algorithm based on depth-edge optimization. SANet combines AANet_Edge with a novel algorithm, SDCO, which efficiently extracts depth edges using a segment structure and applies a two-layer optimization framework to generate accurate dense disparity maps. These maps then serve as pixel-by-pixel supervision during training. SANet also incorporates multi-scale reconstructed left maps and multi-scale edge-aware modules to learn the structural features of the input image. We evaluate SANet in comprehensive experiments on two standard benchmark datasets, KITTI 2012 and KITTI 2015. The results show that SANet produces accurate disparity maps for unseen scenes and when only limited images are available, and that it achieves high cross-domain generalization performance.
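The abstract does not say how the reconstructed left maps are computed. As a rough illustration only, the sketch below assumes the common approach for rectified stereo pairs: the left pixel at column x is sampled from the right image at column x - d, with linear interpolation, and compared with the real left image. The names `reconstruct_left` and `photometric_l1` are hypothetical, and the code handles a single scale and a single grayscale channel; it is not SANet's implementation.

```python
import math

def reconstruct_left(right, disp):
    """Warp a right image into the left view (assumed scheme, not SANet's code).

    right, disp: lists of rows with equal shapes; disp holds left-view
    disparities in pixels. Left pixel (y, x) is sampled at right (y, x - d).
    """
    out = []
    for row_r, row_d in zip(right, disp):
        w = len(row_r)
        rec_row = []
        for x, d in enumerate(row_d):
            # Sampling position in the right image, clamped to the image border.
            xs = min(max(x - d, 0.0), w - 1.0)
            x0 = int(math.floor(xs))
            x1 = min(x0 + 1, w - 1)
            frac = xs - x0
            # Linear interpolation between the two neighbouring columns.
            rec_row.append((1.0 - frac) * row_r[x0] + frac * row_r[x1])
        out.append(rec_row)
    return out

def photometric_l1(left, right, disp):
    """Mean absolute difference between the left image and its reconstruction."""
    rec = reconstruct_left(right, disp)
    total = sum(abs(a - b) for rl, rr in zip(left, rec) for a, b in zip(rl, rr))
    count = sum(len(row) for row in left)
    return total / count
```

A multi-scale variant would presumably repeat this at several resolutions, scaling the disparity with the image width, though the abstract gives no details on that.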
