Abstract

A view-based, high-dimensional feature-space recognition system called SEEMORE was developed as a testbed to explore the representational trade-offs that arise when a simple feedforward neural architecture is challenged with a difficult 3D object recognition problem. Particular emphasis was placed on designing an object representation that could: 1) cope with a large number of real 3D objects of many different types; 2) operate directly on input images without shift, scale, or other object pre-normalization steps; 3) integrate multiple visual cues; and 4) recognize objects over 6 degrees of freedom of viewpoint, gross non-rigid shape distortions, and/or partial occlusion. Recognition results were obtained using a set of 102 color and shape feature channels, each designed to be invariant to image-plane shifts and rotations, and only modestly sensitive to orientation in depth. On a test set of 600 novel views of 100 objects, presented individually in color video images, SEEMORE identified the object correctly 97% of the time using a nearest-neighbor classifier. Similar levels of performance were obtained for the subset of 15 non-rigid objects.
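
As an illustration of the classification step only, the following is a minimal sketch of nearest-neighbor matching over 102-dimensional feature vectors. It is not SEEMORE's implementation: the abstract does not specify the distance metric or how the feature channels are computed, so the Euclidean distance, the synthetic random "views", and all names here (`nearest_neighbor`, `noisy_view`, `prototypes`) are assumptions made for the example.

```python
import math
import random

NUM_CHANNELS = 102  # number of feature channels reported in the abstract


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors.

    Assumption: the abstract does not name the metric actually used.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(query, training_views):
    """Return the label of the stored training view closest to `query`.

    `training_views` is a list of (label, feature_vector) pairs.
    """
    best_label, best_dist = None, math.inf
    for label, features in training_views:
        dist = euclidean(query, features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Synthetic stand-in data (hypothetical): each object has a random prototype
# vector, and each "view" is that prototype plus a small Gaussian perturbation.
rng = random.Random(0)
prototypes = {
    f"object_{i}": [rng.random() for _ in range(NUM_CHANNELS)]
    for i in range(5)
}


def noisy_view(prototype, noise=0.02):
    """Simulate a new view as a slightly perturbed copy of the prototype."""
    return [value + rng.gauss(0.0, noise) for value in prototype]


# Three stored training views per object, then one novel test view.
training_views = [
    (label, noisy_view(proto))
    for label, proto in prototypes.items()
    for _ in range(3)
]
query = noisy_view(prototypes["object_2"])
predicted = nearest_neighbor(query, training_views)
print(predicted)
```

Storing several views per object and matching a novel view against all of them mirrors the view-based setup described above. In a real system the vectors would come from the feature channels rather than from random prototypes.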

