Abstract

Depth estimation is a classical problem in computer vision, which typically relies on either a depth sensor or stereo matching alone. The depth sensor provides real-time estimates in repetitive and textureless regions where stereo matching is not effective. However, stereo matching can obtain more accurate results in rich texture regions and object boundaries where the depth sensor often fails. We fuse stereo matching and the depth sensor using their complementary characteristics to improve the depth estimation. Here, texture information is incorporated as a constraint to restrict the pixel’s scope of potential disparities and to reduce noise in repetitive and textureless regions. Furthermore, a novel pseudo-two-layer model is used to represent the relationship between disparities in different pixels and segments. It is more robust to luminance variation by treating information obtained from a depth sensor as prior knowledge. Segmentation is viewed as a soft constraint to reduce ambiguities caused by under- or over-segmentation. Compared to the average error rate 3.27% of the previous state-of-the-art methods, our method provides an average error rate of 2.61% on the Middlebury datasets, which shows that our method performs almost 20% better than other “fused” algorithms in the aspect of precision.

Highlights

  • IntroductionDepth estimation is one of the most fundamental and challenging problems in computer vision

  • Depth estimation is one of the most fundamental and challenging problems in computer vision.For decades, it has been important for many advanced applications, such as 3D reconstruction [1], robotic navigation [2], object recognition [3] and free viewpoint television [4]

  • It is clear that our method performs almost 20% better than other “fused” algorithms in the aspect of precision

Read more

Summary

Introduction

Depth estimation is one of the most fundamental and challenging problems in computer vision. It has been important for many advanced applications, such as 3D reconstruction [1], robotic navigation [2], object recognition [3] and free viewpoint television [4]. The goal of passive methods like stereo matching is to estimate a high-resolution dense disparity map by finding corresponding pixels in image sequences [5]. These methods heavily rely on how the scene is presented and contain error matchings caused by the luminance variation.

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call