A stereo matching algorithm based on the improved PSMNet.

Zedong Huang, Xuefei Yu, Jinan Gu, Jing Li

doi:10.1371/journal.pone.0251657

Open Access

https://doi.org/10.1371/journal.pone.0251657

Journal: PLOS ONE
Publication Date: Aug 19, 2021
Citations: 8
License type: CC BY 4.0

Affiliation: Jiangsu University

Abstract

Deep learning based on convolutional neural networks (CNNs) has been successfully applied to stereo matching, greatly improving both speed and accuracy over traditional methods. However, existing CNN-based stereo matching frameworks often encounter two problems. First, the networks have many parameters, which makes the matching run time too long. Second, disparity estimation is inadequate in regions where reflections, repeated textures, and fine structures make the problem ill-posed. We propose a lightweight improvement of the PSMNet (Pyramid Stereo Matching Network) model that addresses the poor matching quality commonly seen in ill-conditioned areas such as repeated-texture and weak-texture regions. In the feature extraction part, ResNeXt is introduced to learn unary features, and an ASPP (Atrous Spatial Pyramid Pooling) module extracts multiscale spatial feature information. A feature fusion module fuses the feature information of different scales to construct the matching cost volume. The improved 3D CNN uses a stacked encoder-decoder structure to further regularize the matching cost volume and to obtain the correspondence between feature points at different disparity levels. Finally, the disparity map is obtained by regression. We evaluate our method on the Scene Flow, KITTI 2012, and KITTI 2015 stereo datasets. The experiments show that the proposed stereo matching network achieves prediction accuracy comparable to that of PSMNet at a much faster running speed.
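The final regression step can be illustrated with a small sketch. It assumes the regression takes the soft-argmin form used in the original PSMNet, where the predicted disparity is the probability-weighted sum of candidate disparities; the abstract itself does not state the exact form. The function name `soft_argmin_disparity` and the `(D, H, W)` cost layout are illustrative choices, not the authors' code.

```python
import numpy as np

def soft_argmin_disparity(cost):
    """Regress a disparity map from a regularized cost volume.

    cost: array of shape (D, H, W); lower cost means a better match.
    Returns an (H, W) array of sub-pixel disparities.
    (Hypothetical helper; the layout is an assumption for this sketch.)
    """
    cost = np.asarray(cost, dtype=np.float64)
    # Softmax over the disparity axis of the negated cost,
    # shifted by the maximum for numerical stability.
    neg = -cost
    neg = neg - neg.max(axis=0, keepdims=True)
    prob = np.exp(neg)
    prob = prob / prob.sum(axis=0, keepdims=True)
    # Expected disparity: sum_d d * p(d).
    d = np.arange(cost.shape[0], dtype=np.float64).reshape(-1, 1, 1)
    return (prob * d).sum(axis=0)
```

Because the output is an expectation rather than a hard argmin, it is differentiable and can take sub-pixel values, which is what allows the disparity to be learned end to end.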

Highlights

Stereo matching is the process of computing the offset between corresponding points in a stereo color image pair to obtain a dense disparity map.
We propose an improved Pyramid Stereo Matching Network (PSMNet) [13] algorithm to estimate disparities in various scenes.
Compared with the 3D aggregation network of PSMNet, we make several key modifications to improve performance and increase inference speed; the details of the structure are shown in Fig 3 and Table 1.
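The 3D aggregation network mentioned in the highlights operates on a matching cost volume. As a rough sketch, the original PSMNet builds this volume by concatenating left features with right features shifted by each candidate disparity; whether the improved network keeps exactly this construction is an assumption here. The function name `build_concat_cost_volume` and the `(C, H, W)` feature layout are hypothetical.

```python
import numpy as np

def build_concat_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-style cost volume (sketch, not the authors' code).

    left_feat, right_feat: feature maps of shape (C, H, W).
    Returns an array of shape (2C, max_disp, H, W). For disparity d,
    left pixel x is paired with right pixel x - d; positions with
    no valid partner (x < d) stay zero.
    """
    C, H, W = left_feat.shape
    volume = np.zeros((2 * C, max_disp, H, W), dtype=left_feat.dtype)
    for d in range(max_disp):
        # Left features occupy the first C channels.
        volume[:C, d, :, d:] = left_feat[:, :, d:]
        # Right features, shifted by d, occupy the last C channels.
        volume[C:, d, :, d:] = right_feat[:, :, :W - d]
    return volume
```

The resulting 4D tensor is what a 3D CNN can convolve over jointly in the disparity, height, and width dimensions.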

Summary

Introduction

Stereo matching is the process of computing the offset between corresponding points in a stereo color image pair to obtain a dense disparity map. It is widely used in autonomous driving, 3D reconstruction, robot navigation, and other fields. Stereo matching is the core technology of a stereo vision system, and its accuracy determines the performance of the entire system. Because of noise, repeated textures, low textures, occlusion, other ill-conditioned areas, and varying lighting conditions, obtaining an accurate disparity map efficiently remains a considerable challenge. As shown in the green box, the matching quality in repeated-texture regions such as windows and roads is poor.
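To make the notion of disparity concrete, the toy sketch below matches a single rectified scanline by comparing small windows with the sum of absolute differences (SAD). This is a classical baseline chosen only for illustration, not the method proposed in the paper; the function name `scanline_disparity` and its parameters are hypothetical.

```python
import numpy as np

def scanline_disparity(left_row, right_row, max_disp, half_win=1):
    """Toy SAD block matching on one rectified scanline (illustrative only).

    For each left pixel x, search d in [0, max_disp] for the right window
    centred at x - d that best matches the left window centred at x.
    Border pixels without a full window keep disparity 0.
    """
    left_row = np.asarray(left_row, dtype=np.float64)
    right_row = np.asarray(right_row, dtype=np.float64)
    width = left_row.shape[0]
    disp = np.zeros(width, dtype=np.int64)
    for x in range(half_win, width - half_win):
        best_cost, best_d = np.inf, 0
        left_win = left_row[x - half_win : x + half_win + 1]
        # Keep the shifted window inside the right image.
        for d in range(0, min(max_disp, x - half_win) + 1):
            right_win = right_row[x - d - half_win : x - d + half_win + 1]
            cost = np.abs(left_win - right_win).sum()
            if cost < best_cost:
                best_cost, best_d = cost, d
        disp[x] = best_d
    return disp
```

On a row with a periodic pattern, several values of `d` give the same cost, so the minimum is not unique. That ambiguity is the repeated-texture difficulty described above.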
