Perceptual organization and visual recognition: David G. Lowe. Kluwer Academic, Boston, Mass., 1985. xi + 162 pp. $29.95

David G. Lowe

doi:10.1016/0734-189x(85)90041-6

Abstract

A computational model is presented for the visual recognition of three-dimensional objects based upon their spatial correspondence with two-dimensional features in an image. A number of components of this model are developed in further detail and implemented as computer algorithms. At the highest level, a verification process has been developed which can determine exact values of viewpoint and object parameters from hypothesized matches between three-dimensional object features and two-dimensional image features. This provides a reliable quantitative procedure for evaluating the correctness of an interpretation, even in the presence of noise or occlusion. Given a reliable method for final evaluation of correspondence, the remaining components of the system are aimed at reducing the size of the search space which must be covered. Unlike many previous approaches, this recognition process does not assume that it is possible to directly derive depth information from the image. Instead, the primary descriptive component is a process of perceptual organization, in which spatial relations are detected directly among two-dimensional image features. A basic requirement of the recognition process is that perceptual organization should accurately distinguish meaningful groupings from those which arise by accident of viewpoint or position. This requirement is used to derive a number of further constraints which must be satisfied by algorithms for perceptual grouping. A specific algorithm is presented for the problem of segmenting curves into natural descriptions. Methods are also presented for using the viewpoint-invariance properties of the perceptual groupings to infer three-dimensional relations directly from the image. The search process itself is described, both for covering the range of possible viewpoints and the range of possible objects. 
A method is presented for using evidential reasoning to combine information from multiple sources to determine the most efficient ordering for the search. This use of evidential reasoning allows a system to automatically improve its performance as it gains visual experience. In summary, spatial organization and recognition are shown to be a practical basis for current systems and to provide a promising path for further development of improved visual capabilities.
