Abstract

The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective at recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms.
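
The output stage described in the abstract, a weighted geometric mean of the selected vertex-detector responses, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the example response values, and the weights are all hypothetical, and the inputs are assumed to have already been blurred and shifted to a common location.

```python
import math


def weighted_geometric_mean(responses, weights):
    """Combine detector responses as (prod r_i ** w_i) ** (1 / sum w_i).

    `responses` are assumed to be non-negative values that were already
    blurred and shifted; `weights` are assumed to be strictly positive.
    Both names are illustrative and do not come from the paper.
    """
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    if not responses:
        raise ValueError("at least one response is required")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    if any(r < 0 for r in responses):
        raise ValueError("responses must be non-negative")

    # A single zero response drives the product, and so the output, to zero.
    if any(r == 0 for r in responses):
        return 0.0

    # Work in the log domain to avoid underflow when many small
    # responses are multiplied together.
    log_sum = sum(w * math.log(r) for r, w in zip(responses, weights))
    return math.exp(log_sum / sum(weights))


# Hypothetical responses of three vertex detectors at one image location.
strong = weighted_geometric_mean([0.8, 0.9, 0.7], [1.0, 1.0, 0.5])
missing_part = weighted_geometric_mean([0.8, 0.0, 0.7], [1.0, 1.0, 0.5])
```

Because the combination is multiplicative, the filter in this sketch responds only when every selected detector responds, which is why `missing_part` evaluates to zero while `strong` does not.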

Highlights

  • Shape is perceptually the most important visual characteristic of an object

  • We introduce a hierarchical object detection technique which is motivated by the shape selectivity of some neurons in inferotemporal cortex

  • In order to distinguish the two types of filter, we refer to the composite shape-selective filter that we propose in this paper as S-COSFIRE (Combination Of Shifted FIlter REsponses) and to the filter proposed in Azzopardi and Petkov (2013b) as V-COSFIRE (S and V stand for shape and vertex, respectively)

Summary

Introduction

Shape is perceptually the most important visual characteristic of an object. Although there is no formal definition, as with most perception-related concepts, it is understood that the two-dimensional shape of an object is characterized by the relative spatial positions of a collection of contour-based features. Let us consider, for instance, the square in Figure 1A, which we refer to as a reference or prototype object. From the point of view of visual perception, the incomplete object in Figure 1B is very similar to the prototype even though it is composed of only 25% of the contour pixels of the reference object. The closed polygon, whose bottom half is equivalent to that of the prototype, is perceptually less similar to it.
