Abstract
Perception of stereoscopic depth requires that visual systems solve a correspondence problem: find parts of the left-eye view of the visual scene that correspond to parts of the right-eye view. The standard model of binocular matching implies that similarity of left and right images is computed by inter-ocular correlation. But the left and right images of the same object are normally distorted relative to one another by the binocular projection, in particular when slanted surfaces are viewed from close distance. Correlation often fails to detect correct correspondences between such image parts. We investigate a measure of inter-ocular similarity that takes advantage of spatially invariant computations similar to the computations performed by complex cells in biological visual systems. This measure tolerates distortions of corresponding image parts and yields excellent performance over a much larger range of surface slants than the standard model. The results suggest that, rather than serving as disparity detectors, multiple binocular complex cells take part in the computation of inter-ocular similarity, and that visual systems are likely to postpone commitment to particular binocular disparities until later stages in the visual process.
Highlights
Stereoscopic vision depends on binocular matching: a process that finds which parts of the left and right eye’s images correspond to the same source in the visual scene (Figure 1)
Responses of modeled binocular complex cells to some stimuli are well approximated by a computation similar to inter-ocular correlation (Fleet et al., 1996; Qian and Zhu, 1997; Anzai et al., 1999), and so a simplifying assumption is often made that inter-ocular correlation can be used to predict outcomes of the computation of similarity in biological vision
In the following we propose that the computation of binocular similarity in biological vision should be modeled using an operation which, first, takes advantage of the spatial invariance found in binocular complex cells and, second, avoids the inapt assumption of uniform disparity
Summary
Stereoscopic vision depends on binocular matching: a process that finds which parts of the left and right eye’s images correspond to the same source in the visual scene (Figure 1). The proposed measure of similarity tolerates distortions of the corresponding parts of left and right images. We implement this measure using a MAX-pooling operation, which has been successfully used for modeling spatially invariant computations by complex cells in service of other functions of biological vision (Riesenhuber and Poggio, 1999; Serre et al., 2007a,b). Computing correlation of Lki over locations Mjk in the right image, and finding the maximal value, yields the sub-template similarity Sik,j (MAX-pooling operation, Equation 3). Patch at i is a solution to the correspondence problem:
Computation of disparity. In both rigid and flexible methods, inter-ocular correspondences are found by computing similarity (S) between multiple parts of the left and right images of the scene (Figures 1, 2). The larger the slope, the stronger the inter-ocular dissimilarity, and so a larger template flexibility is needed to attain accurate binocular matching.
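The contrast between rigid and flexible matching can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation and does not reproduce Equation 3: the function names, the equal-sized split into sub-templates, the integer shift range `flex`, and the averaging of sub-template maxima into one score are all assumptions made for illustration. The rigid measure correlates the whole left patch with the right image at one position; the flexible measure correlates each sub-template over a small range of positions and keeps the maximum (MAX-pooling).

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom > 0 else 0.0


def rigid_similarity(left_patch, right_row, pos):
    """Correlate the whole left patch with the right-image patch starting at pos."""
    return correlation(left_patch, right_row[pos:pos + len(left_patch)])


def flexible_similarity(left_patch, right_row, pos, n_sub, flex):
    """Split the left patch into n_sub sub-templates, MAX-pool each one's
    correlation over shifts in [-flex, flex], then average the maxima.
    The averaging step is an assumption of this sketch."""
    size = len(left_patch) // n_sub
    maxima = []
    for k in range(n_sub):
        sub = left_patch[k * size:(k + 1) * size]
        best = -1.0  # lowest possible correlation
        for shift in range(-flex, flex + 1):
            start = pos + k * size + shift
            if start < 0 or start + size > len(right_row):
                continue  # skip shifts that fall outside the right image
            best = max(best, correlation(sub, right_row[start:start + size]))
        maxima.append(best)
    return sum(maxima) / len(maxima)


# Hypothetical data: the right view equals the left patch with one extra
# sample (7) inserted in the middle, a crude stand-in for slant distortion.
left_patch = [3, 1, 4, 1, 5, 9, 2, 6]
right_row = [0, 3, 1, 4, 1, 7, 5, 9, 2, 6, 0]

print("rigid:   ", rigid_similarity(left_patch, right_row, 1))
print("flexible:", flexible_similarity(left_patch, right_row, 1, n_sub=2, flex=1))
```

In this toy case each half of the left patch finds an exact match within one sample of its nominal position, so the flexible score stays high while the rigid correlation drops. A larger distortion would require a larger `flex`, in line with the remark above about template flexibility.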