We show that the classical problem of three-dimensional (3D) size perception in obliquely viewed pictures can be understood by comparing human performance to the optimal geometric solution. A photograph seen from the camera position can form the same retinal projection as the physical 3D scene, but retinal projections of sizes and shapes are distorted in oblique viewing. For real scenes, we previously showed that size and shape inconstancy result despite observers using the correct geometric back-transform, because some retinal images evoke misestimates of object slant or viewing elevation. Now, we examine how observers estimate 3D sizes in oblique views of pictures of objects lying on the ground in different poses. Compared to estimates for real scenes, in oblique views of pictures, sizes were seriously underestimated for objects at frontoparallel poses, but there was almost no change for objects perceived as pointing toward the viewer. The inverse of the function relating projected length to pose, camera elevation, and viewing azimuth gives the optimal correction factor for inferring correct 3D lengths if the elevation and azimuth are estimated accurately. Empirical correction functions had shapes similar to the optimal function, but lower amplitude. Measurements revealed that observers systematically underestimated viewing azimuth, similar to the frontoparallel bias for object pose perception. A model that adds underestimation of viewing azimuth to the geometrical back-transform provided good fits to estimated 3D lengths from oblique views. These results add to accumulating evidence that observers use internalized projective geometry to perceive sizes, shapes, and poses in 3D scenes and their pictures.