Abstract

Stereo matching is a traditional method for obtaining 3D depth information and has been studied for decades. However, it is still difficult to apply stereo matching algorithms to practical devices because of real-time constraints and the technique's inability to adequately handle untextured regions. In this paper, we propose a hybrid stereo matching system that remedies the disadvantages of active and passive stereo vision.

Stereo matching algorithm

Following Scharstein's taxonomy [5], the stereo matching algorithm is divided into four steps: matching cost computation, cost aggregation, disparity computation, and disparity refinement. First, we compute the raw cost volume using AD-Census [2, 6]. Here we combine the AD and Census costs by alpha-blending, and the final raw cost is the sum of the pattern cost (T1) and the non-pattern cost (T2). The information permeability filtering (PF) proposed by Cevahir Cigla et al. [2] is an adaptive support weight (ASW) approach with simple parameters and a constant operational time for cost aggregation. However, because it has no proximity weight term, PF can encounter problems with images containing large untextured regions. A modified information permeability filter (MPF), which includes a proximity weight term, is defined in Section 2.2. Our proposed system uses winner-take-all (WTA) for disparity computation because it is a very simple algorithm.

Stereo vision system design

The proposed system is composed of the stereo head and the stereo emulator. The stereo head includes an LVDS module, an LD/LED projection module, and two CMOS sensors. Input from the two CMOS sensors is received in the form of 10-bit monochrome images at a resolution of 1280 x 720 pixels at 60 fps. The image streams from the left and right cameras, together with control signals, are transferred to the FPGA board through the LVDS module. The stereo emulator is based on FPGA modules that compute disparity maps from stereo pairs; it contains a deserializer module, a USB 3.0 controller, and four FPGAs. The USB 3.0 controller transfers the results to the computer and receives parameters for the FPGAs. Figure 1(a) shows the entire system. The hybrid system captures a pair of pattern images and an alternating pair of non-pattern images to evaluate disparity. The CMOS sensors are synchronized with the LD/LED projection module, so that images are captured with the LD (pattern) in one frame and with the LED (non-pattern) in the next frame.

Implementing the stereo algorithm

Figure 1(b) is a block diagram of the hybrid stereo vision system. The stereo matching algorithm consists of four processing elements, each implemented in a single FPGA. The PrePE first performs rectification (based on the Caltech method) and image filtering (based on bilateral filtering). The MPE Left generates a left-referenced disparity map using the algorithm described in Section 2. Certain modifications are required when implementing the system on the FPGA, and they mostly concern the cost aggregation. One issue is whether the operations can be processed within the limited clock cycles. To address this, we use parallel processing, pipeline insertion, and LUTs as much as possible. We also configure parameters as powers of two so that a data shifter can be used instead of a divider. Another issue is whether to use the 4-way PF in cost aggregation. When implement-
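
The matching cost step described above can be illustrated with a minimal NumPy sketch. This is not the paper's FPGA implementation: it assumes an alpha-blended AD-Census of the form alpha * C_census + (1 - alpha) * C_AD and simply sums the costs from the pattern and non-pattern pairs to form T1 + T2; the window size, alpha value, and function names are assumptions for illustration.

```python
import numpy as np

def census_transform(img, win=7):
    """Census transform: for each pixel, a bit vector comparing neighbors to the center."""
    r = win // 2
    bits = []
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            bits.append((shifted < img).astype(np.uint8))
    return np.stack(bits, axis=-1)                      # H x W x (win*win - 1)

def ad_census_cost(left, right, d, alpha=0.4):
    """Alpha-blended AD-Census cost for a single disparity d (alpha is an assumed value)."""
    right_d = np.roll(right, d, axis=1)                 # align right image to left at disparity d
    ad = np.abs(left.astype(np.float32) - right_d.astype(np.float32)) / 255.0
    ham = (census_transform(left) != census_transform(right_d)).mean(axis=-1)
    return alpha * ham + (1.0 - alpha) * ad

def raw_cost_volume(left_pat, right_pat, left_np, right_np, dmax):
    """Final raw cost = pattern cost (T1) + non-pattern cost (T2)."""
    h, w = left_pat.shape
    vol = np.empty((dmax, h, w), np.float32)
    for d in range(dmax):
        vol[d] = ad_census_cost(left_pat, right_pat, d) + ad_census_cost(left_np, right_np, d)
    return vol
```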
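
The MPF of Section 2.2 is not reproduced here; the following sketch only conveys the idea of permeability-style aggregation with an added proximity decay, shown for the horizontal direction (the 4-way filter would repeat the same scans vertically). The weight form exp(-|dI|/sigma_c) * exp(-1/sigma_p) and the sigma values are assumptions, not the paper's formulation.

```python
import numpy as np

def mpf_horizontal(cost, guide, sigma_c=12.0, sigma_p=30.0):
    """
    Horizontal information-permeability aggregation with an assumed proximity term.
    cost: cost volume shaped (dmax, H, W); guide: grayscale guide image (H, W).
    """
    dmax, h, w = cost.shape
    diff = np.abs(np.diff(guide.astype(np.float32), axis=1))   # intensity step between x-1 and x
    mu = np.exp(-diff / sigma_c) * np.exp(-1.0 / sigma_p)      # permeability * proximity, (H, W-1)

    left = cost.copy()                                         # left-to-right accumulation
    for x in range(1, w):
        left[:, :, x] += mu[None, :, x - 1] * left[:, :, x - 1]
    right = cost.copy()                                        # right-to-left accumulation
    for x in range(w - 2, -1, -1):
        right[:, :, x] += mu[None, :, x] * right[:, :, x + 1]
    return left + right - cost                                 # each pixel's own cost counted once
```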
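
Disparity computation by WTA then reduces to a per-pixel argmin over the aggregated cost volume:

```python
import numpy as np

def wta_disparity(aggregated):
    """Winner-take-all: the disparity with the minimum aggregated cost at each pixel."""
    return np.argmin(aggregated, axis=0)        # aggregated shaped (dmax, H, W)
```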
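
The power-of-two configuration mentioned above lets a shift stand in for a divider; a trivial illustration, with the window size an assumed example:

```python
def window_mean(cost_sum, log2_win=4):
    """
    Normalization by a power-of-two count (2**4 = 16 here, an assumed window size):
    a right shift replaces the divider, which is the FPGA-friendly choice noted above.
    """
    return cost_sum >> log2_win                 # identical to cost_sum // 16 for integers
```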
