Abstract

Binocular stereopsis is the ability of a visual system, whether of a living being or a machine, to interpret the different visual information deriving from two eyes or cameras for depth perception. From this perspective, ground-truth information about three-dimensional visual space, which is rarely available, is an ideal tool both for evaluating human performance and for benchmarking machine vision algorithms. In the present work, we implemented a rendering methodology in which the camera pose mimics the realistic eye pose of a fixating observer, thus including convergent eye geometry and cyclotorsion. The virtual environment we developed relies on highly accurate 3D virtual models, and its full controllability allows us to obtain the stereoscopic pairs together with the ground-truth depth and camera pose information. We thus created a stereoscopic dataset, GENUA PESTO (GENoa hUman Active fixation database: PEripersonal space STereoscopic images and grOund truth disparity). The dataset aims to provide a unified framework useful for a number of problems relevant to human and computer vision, from scene exploration and eye movement studies to 3D scene reconstruction.

Highlights

  • Stereopsis is commonly dealt with as a static problem, because the disparity map obtained by a fixed-geometry stereo camera pair with parallel axes is sufficient to reconstruct the 3D spatial layout of the observed scene.

  • A systematic collection of stereoscopic image pairs under vergent geometry, with ground-truth depth/disparity information, would be an ideal tool to characterize the problem of purposeful 3D vision.

  • Since no databases of images with ground-truth vector disparity are available, the proposed database is unique of its kind.


Background & Summary

Stereopsis is commonly dealt with as a static problem, because the disparity map obtained by a fixed-geometry stereo camera pair with parallel axes is sufficient to reconstruct the 3D spatial layout of the observed scene. A systematic collection of stereoscopic image pairs under vergent geometry, with ground-truth depth/disparity information, would be an ideal tool to characterize the problem of purposeful 3D vision. Such datasets are rare or nearly absent. Few provide stereoscopic images with disparity data: see, e.g., the Middlebury[9], IMPART[10], KITTI[11,12], and SYNS[13] datasets. They mainly follow a standard machine vision approach, i.e., with parallel optical axes for the two cameras (off-axis technique). Since no databases of images with ground-truth vector disparity are available, the proposed database is unique of its kind. For example, it allows deriving quantitative performance indexes for horizontal and vertical disparity estimation algorithms, both on a pixel and a local basis[9,68,69,70]. In summary, the present dataset of stereoscopic images provides a unified framework useful for many problems relevant to human and computer vision, from visual exploration and attention to eye movement studies, from perspective geometry to depth reconstruction.
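
As an illustration of what pixel-wise and local performance indexes for vector disparity might look like, the following is a minimal sketch, not the evaluation procedure used for the dataset. It assumes disparity maps are stored as grids of (horizontal, vertical) pairs; the function names `disparity_errors` and `block_mean`, the endpoint-error measure, and the non-overlapping block averaging are all assumptions introduced here for illustration.

```python
import math


def disparity_errors(est, gt):
    """Per-pixel errors between estimated and ground-truth vector disparity.

    est, gt: grids (lists of rows) of (dx, dy) tuples with the same shape.
    Returns three grids: absolute horizontal error, absolute vertical error,
    and Euclidean (endpoint) error. Hypothetical helper, for illustration only.
    """
    h_err, v_err, epe = [], [], []
    for row_est, row_gt in zip(est, gt):
        h_row, v_row, e_row = [], [], []
        for (ex, ey), (gx, gy) in zip(row_est, row_gt):
            h_row.append(abs(ex - gx))
            v_row.append(abs(ey - gy))
            e_row.append(math.hypot(ex - gx, ey - gy))
        h_err.append(h_row)
        v_err.append(v_row)
        epe.append(e_row)
    return h_err, v_err, epe


def block_mean(grid, size):
    """Local index: mean of a per-pixel grid over non-overlapping blocks.

    Blocks at the right and bottom borders may be smaller than size x size.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            vals = [
                grid[i][j]
                for i in range(r, min(r + size, rows))
                for j in range(c, min(c + size, cols))
            ]
            out_row.append(sum(vals) / len(vals))
        out.append(out_row)
    return out


# Toy 2x2 example: a single pixel is wrong by (3, 4) pixels.
gt = [[(0, 0), (0, 0)], [(0, 0), (0, 0)]]
est = [[(3, 4), (0, 0)], [(0, 0), (0, 0)]]
h_err, v_err, epe = disparity_errors(est, gt)
local_epe = block_mean(epe, 2)
```

Keeping the horizontal and vertical components separate matters under vergent geometry, where vertical disparity is generally nonzero, whereas a parallel-axis benchmark would typically report the horizontal component alone.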

