Abstract

Supervised learning methods have been widely used to compute the stereo matching cost. These methods learn their parameters from public datasets with ground truth disparity maps. Because labeling ground truth disparities is labor-intensive, the available training data are limited, which makes it difficult to apply these supervised methods in practice. This paper proposes the two-branch convolutional sparse representation (TCSR) model. It learns the convolutional filter bank from stereo image pairs in an unsupervised manner, which reduces the redundancy of the convolution kernels. Based on the TCSR model, an unsupervised stereo matching cost (USMC) that does not rely on ground truth disparity maps is designed. A feasible iterative algorithm for the TCSR model is also given, and its convergence is proven. Experimental results on four popular datasets and one monocular video clip show that the USMC achieves higher accuracy and good generalization performance.

Highlights

  • Stereo matching, also known as disparity mapping, is one of the key techniques in stereo vision research

  • The fifth term of the model constrains the feature map ZLjk extracted from the left image ILj to approximate the feature map ZRjk extracted from the right image IRj under the same convolution kernel. This prevents the same feature from being represented by different convolution kernels, which reduces the redundancy of the convolution kernels

  • This paper introduces the two-branch technique into convolutional sparse representation for the first time and builds the two-branch convolutional sparse representation (TCSR) model
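
The coupling idea in the second highlight can be illustrated with a small sketch. The exact form of the fifth term is not given on this page, so the code below assumes a squared-difference penalty between left and right feature maps that share a kernel index. The function name `coupling_penalty` and the nested-list layout (kernels, rows, columns) are hypothetical choices for illustration, not the authors' implementation.

```python
def coupling_penalty(z_left, z_right):
    """Sum of squared differences between left and right feature maps.

    Assumption: the penalty is sum_k ||Z_L[k] - Z_R[k]||^2, where k indexes
    the shared convolution kernels. Each argument is a list of 2-D maps
    (lists of rows), one map per kernel.
    """
    if len(z_left) != len(z_right):
        raise ValueError("both branches must use the same number of kernels")
    total = 0.0
    for map_left, map_right in zip(z_left, z_right):
        if len(map_left) != len(map_right):
            raise ValueError("feature maps must have the same height")
        for row_left, row_right in zip(map_left, map_right):
            if len(row_left) != len(row_right):
                raise ValueError("feature maps must have the same width")
            for a, b in zip(row_left, row_right):
                total += (a - b) ** 2
    return total


# Two kernels, each with a 1x2 feature map per branch.
left = [[[1.0, 2.0]], [[0.0, 0.0]]]
right = [[[0.0, 0.0]], [[3.0, 0.0]]]
penalty = coupling_penalty(left, right)
```

A penalty of zero means both branches respond identically under every shared kernel. Larger values indicate that the same kernel produces different responses in the left and right images, which is the mismatch a term of this kind would discourage.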


Summary

INTRODUCTION

Stereo matching, also known as disparity mapping, is one of the key techniques in stereo vision research. CNN-based methods have achieved better performance than conventional methods on challenging public benchmark datasets such as KITTI [20] and Middlebury 2014 [2]. However, the performance of CNN-based stereo matching costs is restricted by the amount of training data with accurate ground truth disparity maps. Methods trained and tested on commonly used datasets, such as KITTI and Middlebury 2014, perform poorly on general scenes or real-life images. Based on the TCSR model, Cheng et al. propose a new unsupervised stereo matching cost. It preserves the geometric details of stereo images, produces smooth disparity maps, and achieves very good performance in challenging regions, such as those with exposure changes and occlusions.

TCSR MODEL
ITERATIVE ALGORITHM
CONVERGENCE ANALYSIS
COMPLEXITY ANALYSIS
STEREO MATCHING
Findings
CONCLUSION