A Fast Stereo Matching Network with Multi-Cross Attention

Ming Wei, Yi Wu, Changji Liu, Jiarong Wang, Jiaqi Sun, Ming Zhu

doi:10.3390/s21186016

Abstract

Deep-learning stereo matching networks have been widely developed and can achieve excellent disparity estimation. In this work, we present a new fast, end-to-end deep-learning stereo matching network that estimates the disparity between the two images of a stereo pair. We extract low-resolution feature maps with a stacked-hourglass feature extractor and build a multi-level detailed cost volume. We also use the edges of the left image to guide disparity optimization and sub-sample with the low-resolution data, which preserves accuracy while keeping the network fast. Furthermore, we design a multi-cross attention model for binocular stereo matching that improves matching accuracy and enables effective end-to-end disparity regression. We evaluate our network on the Scene Flow, KITTI2012, and KITTI2015 datasets; the experimental results show that our method is both fast and accurate.

Highlights

The binocular camera plays a significant role in autonomous driving, target detection, and other fields.
The purpose of stereo matching is to find the corresponding pixels in the binocular images [5].
To test the effectiveness of our multi-level cost volume, we conduct an ablation study that compares conventional cost-volume construction with our construction and measures the effect of each on the network's results.

Summary

Introduction

The binocular camera plays a significant role in autonomous driving, target detection, and other fields. It offers several advantages, including a much lower price than LiDAR, better performance, and fewer errors [1,2]. With a binocular camera, we can obtain excellent depth estimates from a pair of rectified left and right images. The purpose of stereo matching is to find the corresponding pixels in the binocular images [5]. If a scene point appears at pixel (x, y) in the left image, the same point appears at (x − d, y) in the right image, where d is the disparity. The depth D of the pixel is then fB/d, where f is the focal length of the camera and B is the baseline distance between the centers of the two cameras [6,7].
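The relation D = fB/d above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the NumPy dependency, the choice of units (focal length in pixels, baseline in metres), and the example values are all assumptions made for illustration. Pixels with zero or negative disparity are treated here as having no valid match and are mapped to infinite depth.

```python
import numpy as np


def disparity_to_depth(disparity, focal_length_px, baseline_m, eps=1e-6):
    """Convert disparity (pixels) to depth (metres) using D = f * B / d.

    Hypothetical helper for illustration. Disparities at or below `eps`
    are treated as invalid and mapped to infinity.
    """
    d = np.asarray(disparity, dtype=np.float64)
    # Suppress the divide-by-zero warning; those entries are replaced below.
    with np.errstate(divide="ignore", invalid="ignore"):
        depth = focal_length_px * baseline_m / d
    return np.where(d > eps, depth, np.inf)


def right_image_coordinate(x_left, y, disparity):
    """Map a left-image pixel (x, y) to its match (x - d, y) in the right image."""
    return x_left - disparity, y


if __name__ == "__main__":
    # Illustrative values only: f = 700 px, B = 0.5 m.
    disparity_map = [[70.0, 35.0], [0.0, 7.0]]
    print(disparity_to_depth(disparity_map, focal_length_px=700.0, baseline_m=0.5))
    print(right_image_coordinate(100, 20, 12))
```

Because depth is inversely proportional to disparity, a fixed disparity error of one pixel costs far more depth accuracy for distant points (small d) than for nearby ones.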



Journal: Sensors
Publication Date: Sep 8, 2021
Citations: 6
License type: CC BY 4.0

Similar Papers

A stereo matching algorithm based on the improved PSMNet
Zedong Huang ... Xuefei Yu
PloS one | VOL. 16 | 19 Aug 2021

Parallax attention stereo matching network based on the improved group-wise correlation stereo network
Xuefei Yu ... Jyotismita Chaki
PloS one | VOL. 17 | 09 Feb 2022

Parallax attention stereo matching network based on the improved group-wise correlation stereo network
Jinan Gu ... Zedong Huang
09 Feb 2022

3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization
Tsun-Hsuan Wang ... Yi-Hsuan Tsai
01 Nov 2019
