Abstract

In this paper, we propose a new dataset for outdoor depth estimation from single and stereo RGB images, acquired from the point of view of a pedestrian. The most recent approaches rely on deep learning techniques, which have been shown to outperform traditional computer vision methods. However, these methods require large amounts of reliable ground-truth data. Although several existing datasets could be used for depth estimation, almost none of them are outdoor-oriented and captured from an egocentric point of view. Our dataset provides a large number of high-definition pairs of color frames and corresponding depth maps from a human perspective. In addition, the proposed dataset features human interaction and great variability of data, as shown in this work.

Highlights

  • The use of supervised deep learning algorithms has created the need for massive amounts of information to achieve their great generalization capabilities

  • Stereo triangulation, structure from motion and depth estimation from monocular frames are problems that can be benchmarked with UASOL

  • Alongside the depth maps produced by the ZED Semi-Global Matching (SGBM) algorithm, as future work, we provide the depth maps computed by a novel deep-learning-based stereo method named GC-Net[14]


Summary

Background & Summary

The use of supervised deep learning algorithms has created the need for massive amounts of information to achieve their great generalization capabilities. The KITTI dataset[5] provides the RGB (stereo pair) and depth maps of 400 different layouts, with a total of 1.6 k frames of roads from the city of Karlsruhe (Germany). This dataset is outdoor, so it fulfills one of our main requirements.

The different scenes provide human interaction and different types of paths and roads a pedestrian could use. One of the reviewed datasets contains only 534 frames, so its scale is the smallest of all those reviewed.

The dataset presented in this paper is UASOL[8]: A Large-scale High-resolution Outdoor Stereo Dataset. It was created at the University of Alicante and consists of an RGB-D stereo dataset, which provides 33 different scenes, each with between 2 k and 10 k frames. These features lend the dataset high variability, which will challenge the generalization capabilities of the algorithms (see Technical Validation section).
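Stereo triangulation, one of the problems listed as benchmarkable with this dataset, rests on the standard relation between disparity and depth for a rectified stereo pair: Z = f · B / d, where f is the focal length in pixels, B the baseline in meters, and d the disparity in pixels. The following is a minimal sketch of that relation only. The function name and the calibration values are hypothetical, chosen for illustration; they are not the calibration of the camera used for the dataset, and this is not the authors' processing pipeline.

```python
import math


def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a disparity in pixels to metric depth using Z = f * B / d.

    Assumes a rectified stereo pair. A zero or negative disparity has no
    finite depth, so it is mapped to infinity.
    """
    if disparity_px <= 0:
        return math.inf
    return focal_px * baseline_m / disparity_px


# Hypothetical calibration values, NOT taken from the dataset.
FOCAL_PX = 700.0    # focal length in pixels (assumed)
BASELINE_M = 0.12   # distance between the two cameras in meters (assumed)

# Larger disparities correspond to closer points.
depths = [disparity_to_depth(d, FOCAL_PX, BASELINE_M) for d in (84.0, 42.0, 0.0)]
```

With these assumed values, halving the disparity doubles the estimated depth, which is why depth error grows quickly for distant outdoor points.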
