Abstract
It is often assumed that humans generate a 3D reconstruction of the environment, either in egocentric or world-based coordinates, but the steps involved are unknown. Here, we propose two reconstruction-based models, evaluated using data from two tasks in immersive virtual reality. We model the observer’s prediction of landmark location based on standard photogrammetric methods and then combine location predictions to compute likelihood maps of navigation behaviour. In one model, each scene point is treated independently in the reconstruction; in the other, the pertinent variable is the spatial relationship between pairs of points. Participants viewed a simple environment from one location, were transported (virtually) to another part of the scene and were asked to navigate back. Error distributions varied substantially with changes in scene layout; we compared these directly with the likelihood maps to quantify the success of the models. We also measured error distributions when participants manipulated the location of a landmark to match the preceding interval, providing a direct test of the landmark-location stage of the navigation models. Models such as these, which start with scenes and end with a probabilistic prediction of behaviour, are likely to be increasingly useful for understanding 3D vision.
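The independent-point variant described above can be illustrated with a minimal sketch. All names, the scene layout, and the Gaussian noise model here are assumptions for illustration, not the paper's implementation: each landmark contributes its own likelihood term for a candidate return position, and the map is the product of those terms over landmarks.

```python
import math

# Hypothetical scene: assumed landmark positions and the true viewing
# location the participant must navigate back to (not from the paper).
landmarks = [(0.0, 2.0), (2.0, 0.0), (-1.5, -1.0)]
target = (0.0, 0.0)
sigma = 0.3  # assumed noise on remembered landmark distances

# The "remembered" distance from the target location to each landmark.
expected = [math.dist(target, lm) for lm in landmarks]

def likelihood(p):
    """Independent-point model: each landmark contributes a Gaussian
    likelihood of its distance mismatch; terms multiply across landmarks."""
    ll = 1.0
    for lm, d0 in zip(landmarks, expected):
        err = math.dist(p, lm) - d0
        ll *= math.exp(-err**2 / (2 * sigma**2))
    return ll

# Evaluate the likelihood map on a coarse grid of candidate positions
# and pick the most likely return location.
grid = [(x / 10, y / 10) for x in range(-30, 31) for y in range(-30, 31)]
best = max(grid, key=likelihood)
```

A pairwise variant would instead score the mismatch in spatial relationships between pairs of landmarks, rather than treating each landmark's term independently as above.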
Highlights
Many studies on 3D representation assume that the parietal cortex generates representations of the scene in an egocentric frame, the hippocampus does so in a world-centred frame, and coordinate transformations account for the passage of information from one frame to another (Andersen et al., 1997; Burgess et al., 1999; Snyder et al., 1998; Mou et al., 2006; Burgess, 2006; O’Keefe and Nadel, 1978; McNaughton et al., 2006).
In relation to psychophysical data, there have been few attempts to model and test the processes assumed to underlie the generation of a 3D reconstruction from images, including the distortions that such processing would be predicted to produce, as we do here.
Participants viewed a simple scene in immersive virtual reality and were teleported to a different location in the scene, from which they had to return to the original location.
Summary
Many studies on 3D representation assume that the parietal cortex generates representations of the scene in an egocentric frame, the hippocampus does so in a world-centred frame, and coordinate transformations account for the passage of information from one frame to another (Andersen et al., 1997; Burgess et al., 1999; Snyder et al., 1998; Mou et al., 2006; Burgess, 2006; O’Keefe and Nadel, 1978; McNaughton et al., 2006). 3D reconstruction is not the only way that a scene could be represented (Gillner and Mallot, 1998; Glennerster et al., 2001; Warren, 2012) and, more generally, there are many ways to guide actions and navigate within a 3D environment that do not involve scene reconstruction (Gibson, 1979; Franz et al., 1998; Möller and Vardy, 2006; Stürzl et al., 2008). Together, these come under the category of “view-based” methods of carrying out tasks. We attempt to reproduce a similar pattern of errors using two variants of a reconstruction-based algorithm.