Abstract

The Bag-of-Visual-Words (BoVW) representation has been widely used to approach a number of high-level computer vision tasks. The idea behind BoVW is similar to that of the Bag-of-Words (BoW) used in Natural Language Processing (NLP): extract features from the dataset, then build feature histograms that represent each instance. Although the approach is simple and effective, which makes it applicable to a wide range of problems, it inherits a well-known limitation of the traditional BoW: it disregards spatial information among the extracted features (the analogue of sequential information in text), which could help capture discriminative visual patterns. In this paper, we alleviate this limitation through the joint use of visual words and multi-directional sequences of visual words (visual n-grams). The contribution of this paper is twofold: (i) we build new, simple yet effective visual features inspired by the popular n-gram representations in NLP, and (ii) we propose Multiple Kernel Learning (MKL) strategies to better exploit the joint use of visual words and visual n-grams in Image Classification (IC) tasks. For the former, we build a codebook of visual n-grams and use them as attributes to represent images by means of the BoVW representation. For the latter, we treat visual words and visual n-grams as different feature spaces and propose MKL strategies to better integrate their visual information. We evaluate our proposal on the image classification task using five datasets: Histopathology, Birds, Butterflies, Scenes and a 6-class subset of CalTech-101. Experimental results show that the proposed strategies exploiting our visual n-grams outperform or are competitive with (i) the traditional BoVW, (ii) the BoVW using visual n-grams under traditional fusion schemes (e.g., ensemble-based classifiers) and (iii) other approaches in the IC literature that consider the spatial context.
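The idea of pairing visual words with multi-directional visual n-grams can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each image is already given as a 2-D integer grid of visual-word indices (for example, densely sampled patches assigned to their nearest codeword), n is fixed to 2, bigrams are read in four directions, every possible word pair is kept instead of a learned n-gram codebook, and the two feature spaces are fused with a fixed weight `beta`, whereas an actual MKL method would learn the kernel weights. All function names are hypothetical.

```python
import numpy as np

def word_histogram(grid, k):
    """L1-normalised histogram of visual words (the plain BoVW vector)."""
    hist = np.bincount(grid.ravel(), minlength=k).astype(float)
    return hist / hist.sum()

def bigram_histogram(grid, k):
    """L1-normalised histogram of visual bigrams read in four directions.

    Assumes the grid is at least 2 x 2 so that every direction yields pairs.
    """
    # (row, column) offsets: right, down, down-right, down-left
    offsets = [(0, 1), (1, 0), (1, 1), (1, -1)]
    h, w = grid.shape
    hist = np.zeros(k * k)
    for dr, dc in offsets:
        # Crop the grid so that both members of each pair stay inside it.
        r0, r1 = 0, h - dr
        c0, c1 = max(0, -dc), w - max(0, dc)
        first = grid[r0:r1, c0:c1]
        second = grid[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
        # Encode the ordered pair (a, b) as the single index a * k + b.
        codes = first.ravel() * k + second.ravel()
        hist += np.bincount(codes, minlength=k * k)
    return hist / hist.sum()

def combined_kernel(grids, k, beta=0.5):
    """Fixed-weight sum of linear kernels over the two feature spaces.

    beta is a hand-set assumption; an MKL method would learn it from data.
    """
    words = np.array([word_histogram(g, k) for g in grids])
    bigrams = np.array([bigram_histogram(g, k) for g in grids])
    k_words = words @ words.T        # kernel on visual words
    k_bigrams = bigrams @ bigrams.T  # kernel on visual bigrams
    return beta * k_words + (1.0 - beta) * k_bigrams
```

Calling `combined_kernel(list_of_grids, k)` returns a symmetric Gram matrix that could be passed to any kernel classifier accepting a precomputed kernel. Keeping the two histograms as separate kernels, instead of concatenating them, mirrors the abstract's view of visual words and visual n-grams as different feature spaces.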

