Abstract

In this work we study how a novel model of spatial saliency (visual attention), combined with image features, can be used to significantly accelerate a scene recognition application while preserving recognition performance. To do so, we use a mobile robot-like application in which scene recognition is carried out by using image features to characterize the different scenarios and the Nearest Neighbor rule to perform the classification. SIFT and SURF are two recent and competitive alternatives for local image description, which we compare through extensive experimental work. The experimental results show that SIFT features perform significantly better than SURF features, achieving substantial reductions in the size of the database of prototypes without significant loss in recognition performance, and thus accelerating scene recognition. The experiments also show that SURF features are less distinctive when very large databases of interest points are used, as occurs in the present case.

Visual attention is the process by which the Human Visual System (HVS) selects, from a given scene, regions of interest that contain salient information, thereby reducing the amount of information to be processed (Treisman, 1980; Koch, 1985). In the last decade, several biologically motivated computational models have been proposed to implement visual attention in image and video processing (Itti, 2000; Garcia-Diaz, 2008). Visual attention has also been used to improve object recognition and scene analysis (Bonaiuto, 2005; Walther, 2005).

In this chapter, we study the utility of a novel model of spatial saliency for improving a scene recognition application by reducing the number of prototypes needed to carry out the classification task. The application is based on mobile robot-like video sequences taken in indoor facilities consisting of several rooms and halls. The aim is to recognize the different scenarios in order to provide the mobile robot system with general location data. The visual attention approach is a novel model of bottom-up saliency that uses the local phase information of the input data, from which second-order statistical information is removed, to obtain a retinotopic saliency map. The proposed approach combines computational mechanisms of two hypotheses widely accepted in early vision: first, the efficient coding hypothesis
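To make the recognition pipeline concrete, the sketch below shows SIFT descriptor extraction and Nearest Neighbor classification with OpenCV. It is a minimal sketch of the general technique described in the abstract, not the chapter's implementation: the function names, the per-descriptor majority vote, and the brute-force L2 matcher are assumptions made for illustration.

```python
import cv2
import numpy as np

# Sketch of the described pipeline: SIFT descriptors are extracted from
# labeled training images of each room, pooled into a prototype database,
# and a query frame is labeled by the Nearest Neighbor rule.
# Names and the voting scheme are illustrative, not from the chapter.

sift = cv2.SIFT_create()

def extract_descriptors(image_gray):
    """Detect SIFT keypoints and return their 128-D descriptors."""
    _, descriptors = sift.detectAndCompute(image_gray, None)
    return descriptors if descriptors is not None else np.empty((0, 128), np.float32)

def build_prototype_db(labeled_images):
    """labeled_images: list of (grayscale image, room label) pairs."""
    db_desc, db_labels = [], []
    for img, label in labeled_images:
        desc = extract_descriptors(img)
        db_desc.append(desc)
        db_labels.extend([label] * len(desc))
    return np.vstack(db_desc), np.array(db_labels)

def classify_frame(frame_gray, db_desc, db_labels):
    """Label a query frame by majority vote over 1-NN matches of its descriptors."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    query = extract_descriptors(frame_gray)
    matches = matcher.match(query, db_desc)
    votes = db_labels[[m.trainIdx for m in matches]]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]
```

In this setting, the role of the saliency model is to prune the prototype database: only descriptors falling in salient regions of the training frames would be kept, shrinking `db_desc` and therefore accelerating the Nearest Neighbor search.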
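The saliency model itself is only summarized in the abstract, but the core idea of keeping local phase while removing second-order statistics can be sketched with a phase-only Fourier reconstruction, since the amplitude spectrum carries the second-order (covariance) structure of an image. The following is a hedged approximation in that spirit, not the chapter's actual model; the function name and smoothing parameter are assumptions.

```python
import cv2
import numpy as np

def phase_saliency(image_gray, sigma=3.0):
    """Hedged sketch: saliency from the phase spectrum alone.
    Setting the Fourier amplitude to one (spectral whitening) discards
    the second-order statistics; the energy of the phase-only inverse
    transform then highlights salient structure, smoothed into a
    retinotopic saliency map."""
    img = image_gray.astype(np.float32)
    spectrum = np.fft.fft2(img)
    phase_only = np.exp(1j * np.angle(spectrum))    # unit amplitude = whitened spectrum
    recon = np.abs(np.fft.ifft2(phase_only)) ** 2   # energy of phase-only reconstruction
    saliency = cv2.GaussianBlur(recon, (0, 0), sigma)  # smooth to a retinotopic map
    return saliency / saliency.max()
```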
