Abstract
The Bag-of-Visual-Words (BoVW) model is widely used for image classification, object recognition, and image retrieval. In the BoVW model, local features are quantized and the 2-D image space is represented as an order-less histogram of visual words. This order-less representation degrades image classification performance. This paper presents a novel image representation that incorporates spatial information into the inverted index of the BoVW model. The spatial information is added by computing the global relative spatial orientation of visual words in a rotation-invariant manner. To this end, we compute the geometric relationship between triplets of identical visual words by calculating an orthogonal vector relative to each point in the triplet. The histogram of visual words is then built from the magnitudes of these orthogonal vectors. This construction also yields distinctive information about the relative positions of visual words when they are collinear. The proposed image representation is evaluated on four standard image benchmarks. The experimental results and quantitative comparisons demonstrate that the proposed representation outperforms the existing state-of-the-art in terms of classification accuracy.
Highlights
One of the most challenging tasks in computer and robotics vision is to classify images into semantic categories [1]
We propose a novel approach that incorporates global spatial information by calculating an orthogonal vector relative to each point in a triplet of identical visual words, as shown in Fig. 3, and building the histogram from the magnitudes of these orthogonal vectors
Though discriminative features are crucial for image classification and have a direct impact on performance, our approach of incorporating spatial context by modeling the relative relationships among triplets of identical visual words yields better results than more recent feature-fusion based approaches
Summary
One of the most challenging tasks in computer and robotics vision is to classify images into semantic categories [1]. Khan et al. [25] made a notable contribution in this domain by incorporating global spatial information into the BoVW model through the global geometric relationships among Pairs of Identical Words (PIWs). This article presents a novel way to model the global relative spatial orientation of visual words in a rotation-invariant manner. For this, we compute the geometric relationship between triplets of identical visual words by calculating an orthogonal vector relative to each point in the triplet and building the histogram from the magnitudes of these orthogonal vectors. The last section concludes the article and points toward future directions of research
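The summary above can be illustrated with a small sketch. One plausible reading of the triplet construction is that, for each point in a triplet of identical visual words, the orthogonal vector is the perpendicular dropped from that point onto the line through the other two points; the histogram is then built over the magnitudes of these perpendiculars. The function names, the perpendicular-to-the-opposite-edge interpretation, and the binning parameters below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def orthogonal_magnitudes(triplet):
    """For each point in a triplet of identical visual words, return the
    magnitude of the orthogonal vector from that point to the line through
    the other two points. This geometric reading is a hypothetical
    interpretation of the paper's construction."""
    p = np.asarray(triplet, dtype=float)
    mags = []
    for i in range(3):
        a, b = p[(i + 1) % 3], p[(i + 2) % 3]
        d = b - a
        base = np.linalg.norm(d)
        if base == 0.0:
            # Degenerate triplet: the two other words coincide,
            # so fall back to the plain distance to that point.
            mags.append(float(np.linalg.norm(p[i] - a)))
            continue
        # Perpendicular distance from p[i] to the line through a and b:
        # |cross(b - a, p[i] - a)| / |b - a|
        v = p[i] - a
        cross = abs(d[0] * v[1] - d[1] * v[0])
        mags.append(cross / base)
    return mags

def spatial_histogram(magnitudes, bins=8, max_dist=100.0):
    """Bin orthogonal-vector magnitudes into a fixed-length histogram
    (bin count and range are illustrative choices)."""
    hist, _ = np.histogram(magnitudes, bins=bins, range=(0.0, max_dist))
    return hist
```

Because perpendicular distances are unchanged by rotating all points together, a histogram over these magnitudes is rotation invariant, which matches the property the summary claims for the proposed representation.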