Abstract
Binocular stereopsis is the ability of a visual system, whether it belongs to a living being or a machine, to interpret the different visual information deriving from two eyes/cameras for depth perception. From this perspective, ground-truth information about three-dimensional visual space, which is rarely available, is an ideal tool both for evaluating human performance and for benchmarking machine vision algorithms. In the present work, we implemented a rendering methodology in which the camera pose mimics the realistic eye pose of a fixating observer, thus including convergent eye geometry and cyclotorsion. The virtual environment we developed relies on highly accurate 3D virtual models, and its full controllability allows us to obtain the stereoscopic pairs together with the ground-truth depth and camera pose information. We thus created a stereoscopic dataset: GENUA PESTO (GENoa hUman Active fixation database: PEripersonal space STereoscopic images and grOund truth disparity). The dataset aims to provide a unified framework useful for a number of problems relevant to human and computer vision, from scene exploration and eye movement studies to 3D scene reconstruction.
Highlights
Stereopsis is commonly dealt with as a static problem, because the disparity map obtained by a fixed-geometry stereo camera pair with parallel axes is sufficient to reconstruct the 3D spatial layout of the observed scene
A systematic collection of stereoscopic image pairs under vergent geometry, with ground-truth depth/disparity information, would be an ideal tool to characterize the problem of purposeful 3D vision
Since no databases of images with ground-truth vector disparity are available, the proposed database is unique of its kind
Summary
Stereopsis is commonly dealt with as a static problem, because the disparity map obtained by a fixed-geometry stereo camera pair with parallel axes is sufficient to reconstruct the 3D spatial layout of the observed scene. A systematic collection of stereoscopic image pairs under vergent geometry, with ground-truth depth/disparity information, would be an ideal tool to characterize the problem of purposeful 3D vision. Such datasets are rare or nearly absent. Few provide stereoscopic images with disparity data; see, e.g., the Middlebury [9], IMPART [10], KITTI [11,12], and SYNS [13] datasets. These mainly follow a standard machine vision approach, i.e., with parallel optical axes for the two cameras (off-axis technique). Since no databases of images with ground-truth vector disparity are available, the proposed database is unique of its kind. For example, it allows deriving quantitative performance indexes for horizontal and vertical disparity estimation algorithms, both on a pixel and a local basis [9,68,69,70]. In summary, the present dataset of stereoscopic images provides a unified framework useful for many problems relevant to human and computer vision, from visual exploration and attention to eye movement studies, from perspective geometry to depth reconstruction.
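As an illustration of what a pixel-wise and a local performance index for vector disparity might look like, the sketch below compares an estimated disparity field against ground truth. It is a minimal example under stated assumptions, not the evaluation protocol of the dataset or of the cited works: the Euclidean error measure, the non-overlapping averaging windows, and the function names `pixel_errors` and `local_errors` are all hypothetical choices made here for clarity.

```python
from math import hypot


def pixel_errors(est, gt):
    """Per-pixel Euclidean error between two vector disparity fields.

    est, gt: grids (lists of rows) of (horizontal, vertical) disparity pairs.
    The Euclidean distance is an assumed error measure, chosen for illustration.
    """
    return [
        [hypot(e[0] - g[0], e[1] - g[1]) for e, g in zip(est_row, gt_row)]
        for est_row, gt_row in zip(est, gt)
    ]


def local_errors(err, size):
    """Mean error over non-overlapping size x size windows (assumed 'local' index)."""
    rows, cols = len(err), len(err[0])
    out = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            window = [err[i][j]
                      for i in range(r, r + size)
                      for j in range(c, c + size)]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


# Toy 2 x 2 field: ground truth is zero disparity everywhere,
# and the estimate is wrong at a single pixel.
gt = [[(0.0, 0.0), (0.0, 0.0)],
      [(0.0, 0.0), (0.0, 0.0)]]
est = [[(3.0, 4.0), (0.0, 0.0)],
       [(0.0, 0.0), (0.0, 0.0)]]

err = pixel_errors(est, gt)
local = local_errors(err, 2)
```

Keeping the horizontal and vertical components together in one error value is only one option; the two components could equally be scored separately if an algorithm estimates them independently.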