Abstract

It has long been recognized that a key obstacle to achieving human-level object recognition performance is the problem of invariance. The human visual system excels at factoring out the image transformations that distort object appearance under natural conditions. Models with a cortex-inspired architecture, such as HMAX, as well as nonbiological convolutional neural networks, are invariant to translation (and in some cases scaling) by virtue of their wiring. The transformations to which this approach has been applied so far are generic: a single example image of any object contains all the information needed to synthesize a new image of the transformed object. In contrast, viewpoint and illumination transformations depend on the object's 3D structure and material properties. These are normally consistent within, but not between, classes.

Class-specific modifications of the HMAX model achieve good viewpoint- and illumination-tolerant performance in a one-shot identification task. Performance suffers when a model that is specialized for transformations of one class is tested on identification within a different class. In fact, viewpoint-pooling models employing templates from the wrong class perform worse on viewpoint-invariant identification tasks than models that have no particular mechanism for dealing with viewpoint at all. The same situation arises for illumination invariance. This is in stark contrast to the generic case, where the model is invariant for all classes undergoing the transformation no matter what templates are used.
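The generic case described above can be illustrated with a toy sketch. This is not the HMAX implementation. It assumes 1D signals, uses cyclic shifts as a stand-in for translation, and draws the templates at random. The names `roll`, `dot`, and `pooled_signature` are hypothetical helpers introduced only for this illustration. Each signature entry is the maximum response of the input to all shifted copies of one template, so shifting the input leaves the signature unchanged whichever templates are chosen.

```python
import math
import random

def roll(v, s):
    """Cyclically shift list v to the right by s positions (toy 'translation')."""
    s %= len(v)
    return v[-s:] + v[:-s]

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def pooled_signature(image, templates):
    """For each template, max-pool the responses over all of its cyclic shifts."""
    n = len(image)
    return [max(dot(image, roll(t, s)) for s in range(n)) for t in templates]

random.seed(0)
n = 16
# Assumption: arbitrary random templates, unrelated to the test input.
templates = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
image = [random.gauss(0, 1) for _ in range(n)]
shifted = roll(image, 5)  # the same "object" after a generic transformation

sig_original = pooled_signature(image, templates)
sig_shifted = pooled_signature(shifted, templates)

# Pooling covers every shift, so both inputs produce the same set of responses.
invariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(sig_original, sig_shifted)
)
print(invariant)
```

The sketch only covers a generic transformation. For viewpoint or illumination changes, the transformed templates cannot be synthesized from a single image of an arbitrary object, which is why templates from the wrong class do not yield the same guarantee.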
