Abstract

Spatial pooling methods such as spatial pyramid matching (SPM) are crucial to the bag-of-features model used in image classification. SPM partitions the image into a set of regular grids and assumes that the spatial layout of all visual words obeys a uniform distribution over these grids. In practice, however, we argue that different visual words should obey different spatial layout distributions. To improve on SPM, we develop a novel spatial pooling method, namely spatial distribution pooling (SDP). The proposed SDP method uses an extension of the Gaussian mixture model to estimate the spatial layout distributions of the visual vocabulary. For each visual word type, SDP generates a set of flexible grids rather than the regular grids of traditional SPM. Furthermore, it computes grid weights for visual word tokens according to their spatial coordinates. The experimental results demonstrate that SDP outperforms traditional spatial pooling methods and achieves classification accuracy competitive with the state of the art on several challenging image datasets.

Highlights

  • Image classification plays a significant role in computer vision research

  • Empirical results show that spatial pyramid matching (SPM) can significantly improve classification performance, but it assumes that the spatial layout of all visual words obeys a uniform distribution over its regular grids

  • We develop a novel spatial distribution pooling (SDP) algorithm to improve the spatial pooling in the bag of words (BoW) model for image classification


Summary

Introduction

Image classification plays a significant role in computer vision research. The recent state-of-the-art image classification pipeline consists of two major parts: 1) the image representation, e.g., bag of features (BoF) [1,2,3] and spatial pyramid matching (SPM) [4]; 2) the classifier, e.g., support vector machines (SVMs) and their variants [5,6]. Empirical results show that SPM can significantly improve classification performance. However, SPM rigidly partitions the image into several regular grids and assumes that the spatial layout of all visual words obeys a uniform distribution over these grids: each visual word in SPM occurs in the regular grids at each level with equal probability. This conflicts with the intuition that different visual words should obey different spatial layout distributions. Under an extended Gaussian mixture model (e-GMM), SDP can assign each visual word to a latent grid according to its spatial coordinate, instead of a regular grid. The inferential problem is to compute the posterior distribution of the grid assignment given a visual word v with spatial coordinate c_v.
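
The contrast between a hard regular-grid assignment and a posterior over latent grids can be illustrated with a minimal sketch. It is not the e-GMM of the paper, whose exact form is not given in this summary. It assumes a plain two-dimensional Gaussian mixture whose parameters for one visual word have already been estimated, and coordinates normalised to [0, 1). The function names (spm_cell_index, gaussian_pdf, soft_grid_weights) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def spm_cell_index(coord, level):
    """Hard SPM-style assignment: index of the regular cell containing coord.

    coord is (x, y) normalised to [0, 1); level l splits each axis into 2**l bins.
    """
    bins = 2 ** level
    col = min(int(coord[0] * bins), bins - 1)
    row = min(int(coord[1] * bins), bins - 1)
    return row * bins + col


def gaussian_pdf(coord, mean, cov):
    """Density of a 2-D Gaussian with the given mean and covariance at coord."""
    diff = np.asarray(coord, dtype=float) - np.asarray(mean, dtype=float)
    inv = np.linalg.inv(cov)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ inv @ diff) / norm)


def soft_grid_weights(coord, mix, means, covs):
    """Posterior p(grid k | coord) under a plain Gaussian mixture.

    Assumption: a standard GMM stands in for the paper's e-GMM. The weights
    are the mixing proportions times the component densities, normalised.
    """
    joint = np.array(
        [pi * gaussian_pdf(coord, m, c) for pi, m, c in zip(mix, means, covs)]
    )
    return joint / joint.sum()


# Hypothetical parameters for one visual word: two latent grids, left and right.
mix = [0.5, 0.5]
means = [np.array([0.25, 0.5]), np.array([0.75, 0.5])]
covs = [0.05 * np.eye(2), 0.05 * np.eye(2)]

token_coord = (0.25, 0.5)
hard_cell = spm_cell_index(token_coord, level=1)            # one cell of a 2x2 grid
weights = soft_grid_weights(token_coord, mix, means, covs)  # one weight per latent grid
```

In this sketch a token contributes to every latent grid in proportion to its posterior weight, whereas the regular-grid assignment places it in exactly one cell regardless of which visual word it belongs to.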
