Abstract

A model-building approach to three-dimensional scene interpretation is described. The approach is embodied in a set of programs called the EYE system, whose emphasis is on integrating depth and semantic information to form a three-dimensional model of a scene. The EYE system transforms an image into a model of a scene by successively converting it into region descriptions, surface representations, and, finally, three-dimensional models of the objects in the scene. Depth cues provide relative and absolute depth and orientation estimates for regions. An iterative algorithm forms a set of globally consistent surface descriptions, using the relative depth estimates to constrain the absolute depth estimates. Surfaces are grouped into distinct instances of objects using “long-term” object models represented with a semantic net. Results are presented for near views of color outdoor scenes. The central thesis of the EYE system is that there are several different types of depth cues and that, while they can be computed independently, they can be meaningfully interpreted only by combining the information from different depth cues under the guidance of semantic information about the objects in a scene.
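
To make concrete the idea of relative depth estimates constraining absolute ones, the following is a minimal illustrative sketch and not the EYE system's actual algorithm, which the abstract does not specify. It assumes each region has one scalar absolute depth estimate (larger means farther) and that relative cues are given as "near is in front of far" pairs. The function name `reconcile_depths`, the region names, and the midpoint update rule are all hypothetical choices made for illustration.

```python
def reconcile_depths(absolute, in_front_of, max_iters=100):
    """Adjust absolute depth estimates until they respect relative ordering.

    absolute:    dict mapping region name -> absolute depth estimate
                 (assumed convention: larger value = farther away).
    in_front_of: list of (near, far) pairs from relative depth cues.
    Returns a new dict; constraints hold if the loop converged
    within max_iters passes.
    """
    depth = dict(absolute)  # copy so the caller's estimates are untouched
    for _ in range(max_iters):
        changed = False
        for near, far in in_front_of:
            if depth[near] > depth[far]:
                # Violated ordering: pull both estimates to their midpoint
                # (a hypothetical update rule chosen for simplicity).
                midpoint = (depth[near] + depth[far]) / 2.0
                depth[near] = midpoint
                depth[far] = midpoint
                changed = True
        if not changed:
            break  # every relative constraint is satisfied
    return depth


# Hypothetical example: the absolute cue wrongly places the tree behind the house.
estimates = {"tree": 5.0, "house": 3.0, "sky": 100.0}
ordering = [("tree", "house"), ("house", "sky")]
consistent = reconcile_depths(estimates, ordering)
```

In this example the conflicting tree and house estimates are both moved to 4.0, while the sky estimate is left at 100.0 because it already satisfies its constraint. A fuller treatment would presumably weight each estimate by its reliability and also handle surface orientation, which this sketch ignores.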
