Abstract

A stereo matching network trained on a single dataset can come to rely too heavily on features specific to that dataset’s scenes, so it performs poorly on other datasets. Feature consistency between matched pixels is therefore a key factor in improving the network’s generalization ability. To address this issue, this paper proposes a more widely applicable stereo matching network that introduces a whitening loss into the feature extraction module and significantly improves the applicability of the model by constraining the variation between salient feature pixels. In addition, the disparity update stage uses a GRU iterative update module, which expands the model’s receptive field at multiple resolutions and allows precise disparity estimation in low-texture areas as well as richly textured ones. The model was trained only on the large-scale Scene Flow dataset and evaluated for disparity estimation on mainstream datasets such as Middlebury, KITTI 2015, and ETH3D. Compared with earlier stereo matching algorithms, the method achieves more accurate disparity estimation and shows wider applicability and stronger robustness.
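As a rough illustration of the whitening-loss idea mentioned above, the sketch below penalizes correlation between feature channels after per-channel standardization. This is a generic whitening loss written in plain Python, not the formulation used in the paper: the function name `whitening_loss`, the `eps` parameter, and the choice of averaging absolute off-diagonal covariance entries are all assumptions made for illustration, and the paper's selection of salient feature pixels is not modeled here.

```python
from typing import List


def whitening_loss(features: List[List[float]], eps: float = 1e-5) -> float:
    """Hypothetical whitening loss over a flattened feature map.

    `features` holds C channels, each a list of N pixel values.
    Returns the mean absolute off-diagonal entry of the channel
    covariance matrix computed on standardized channels.
    """
    n = len(features[0])

    # Standardize every channel to zero mean and (approximately) unit variance.
    normed = []
    for channel in features:
        mean = sum(channel) / n
        var = sum((v - mean) ** 2 for v in channel) / n
        std = (var + eps) ** 0.5
        normed.append([(v - mean) / std for v in channel])

    # Accumulate the absolute covariance of every distinct channel pair.
    total, pairs = 0.0, 0
    for i in range(len(normed)):
        for j in range(i + 1, len(normed)):
            cov = sum(a * b for a, b in zip(normed[i], normed[j])) / n
            total += abs(cov)
            pairs += 1

    # With fewer than two channels there is nothing to decorrelate.
    return total / pairs if pairs else 0.0


# Two perfectly correlated channels give a loss near 1;
# two uncorrelated channels give a loss of 0.
correlated = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
decorrelated = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
loss_correlated = whitening_loss(correlated)
loss_decorrelated = whitening_loss(decorrelated)
```

In a training setup, a term of this kind would typically be added to the disparity loss with a weighting factor, so that the feature extractor is pushed toward decorrelated, less dataset-specific channels.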
