An Enhancement to the Spatial Pyramid Matching for Image Classification and Retrieval

Priyabrata Karmakar, Shyh Wei Teng, Guojun Lu, Dengsheng Zhang

doi:10.1109/access.2020.2969783

Open Access

https://doi.org/10.1109/access.2020.2969783

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 32	License type: CC BY 4.0

Affiliation: Federation University

Abstract

Spatial pyramid matching (SPM) is one of the most widely used methods for incorporating spatial information into an image representation. Despite its effectiveness, traditional SPM is not rotation invariant. A rotation-invariant SPM has been proposed in the literature, but its effectiveness is limited in several respects. In this paper, we investigate how to make SPM robust to rotation by addressing those limitations. In an SPM framework, an image is divided into an increasing number of partitions at different pyramid levels. Our main focus is on how to partition images in such a way that the resulting structure can deal with image-level rotations. To that end, we investigate three concentric ring partitioning schemes. Apart from image partitioning, another important component of the SPM framework is the weight function, which apportions the contribution of each pyramid level to the final matching between two images. We propose a new weight function that is suitable for the rotation-invariant SPM structure. Experiments on image classification and retrieval are performed on five image databases. A detailed analysis of the results shows that the proposed approach enhances the effectiveness of SPM for image classification and retrieval.

Highlights

Over the last decade, bag of words (BOW) [1] has become one of the most successful image representations for image classification and retrieval tasks.
1) Image classification results comparing the effectiveness of RI-SPM with the existing rotation-invariant SPM (SPR): the performances of RI-SPMs using the three partitioning schemes are compared with the performance of the spatial pyramid ring (SPR) [4].
2) Image classification results comparing the effectiveness of RI-SPMs with traditional SPM (TrSPM) and validating the effectiveness of the generalized weight function (GWF): image classification results are provided to compare how robust RI-SPMs are with respect to TrSPM.

Summary

Introduction

Bag of words (BOW) [1] has become one of the most successful image representations for image classification and retrieval tasks. In BOW, local descriptors, such as the scale-invariant feature transform (SIFT) [2], are extracted from all the images in a database, and the local descriptors of the training images are then clustered to obtain a visual word dictionary. Although BOW is a popular approach, it lacks spatial information. To overcome this issue, spatial pyramid matching (SPM) [3] was proposed. In SPM, an image is divided into an increasing number of grid partitions at successive pyramid levels. Each grid partition is represented by a histogram of visual words. The concatenation of the histograms obtained from all the grid partitions of a particular pyramid level is the image-level representation of that level. Level-wise similarity scores are obtained by applying the histogram intersection kernel between the
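
The per-level representation and the histogram intersection kernel described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `grid_histograms` and `histogram_intersection`, the 2^l x 2^l grid convention, and the toy keypoints are all assumptions made for illustration, and the visual-word index of each keypoint is assumed to be already known (for example, from nearest-neighbour assignment to a clustered dictionary).

```python
import numpy as np


def grid_histograms(xy, words, image_size, level, vocab_size):
    """Concatenate per-cell visual-word histograms for one pyramid level.

    xy:     (N, 2) array of keypoint coordinates (x, y).
    words:  (N,) integer array of visual-word indices.
    level:  pyramid level l; the image is split into 2**l x 2**l cells
            (an assumed convention for this sketch).
    """
    width, height = image_size
    cells = 2 ** level
    # Map each keypoint to a grid column/row; clamp points on the far edge.
    col = np.minimum((xy[:, 0] * cells / width).astype(int), cells - 1)
    row = np.minimum((xy[:, 1] * cells / height).astype(int), cells - 1)
    cell = row * cells + col
    hist = np.zeros((cells * cells, vocab_size))
    # Unbuffered accumulation so repeated (cell, word) pairs are all counted.
    np.add.at(hist, (cell, words), 1)
    return hist.ravel()


def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    return float(np.minimum(h1, h2).sum())


# Toy example (hypothetical data): four keypoints in a 100 x 100 image,
# and the same keypoints after a 90-degree rotation of the image.
words = np.array([0, 1, 1, 2])
xy_a = np.array([[10.0, 10.0], [90.0, 10.0], [10.0, 90.0], [90.0, 90.0]])
xy_b = np.array([[90.0, 10.0], [90.0, 90.0], [10.0, 10.0], [10.0, 90.0]])

for level in (0, 1):
    h_a = grid_histograms(xy_a, words, (100, 100), level, vocab_size=3)
    h_b = grid_histograms(xy_b, words, (100, 100), level, vocab_size=3)
    print(level, histogram_intersection(h_a, h_b))
```

In this toy case the level-0 histograms (whole image) match fully, while at level 1 every keypoint lands in a different grid cell after rotation and the intersection drops to zero, which illustrates why grid-based partitioning is sensitive to image-level rotation.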

Objectives

Results

Conclusion