Abstract
In object recognition, Spatial Pyramid Matching (SPM) has been the most popular framework for incorporating spatial information into the bag-of-words model. SPM divides level l of the pyramid into 2^l × 2^l spatial windows, extracts a codeword histogram from each window, and concatenates the histograms to form the image representation. In this way, SPM imposes an approximate spatial arrangement on the otherwise unordered collection of codewords. This paper presents a detailed investigation of the optimality of the traditional SPM model and offers a framework for selecting the optimal spatial window arrangement from the set of possible spatial windows. Using this model, we consistently achieve significant increases in recognition performance, up to 4.38%. Because it does so with nearly 40% less memory, the result shows that the traditional spatial window arrangement of SPM is indeed inefficient. We tested the proposed model on the 15 Scene, Caltech 101, Caltech 256, MIT-Indoor, UIUC-Sport, and STL-10 datasets.
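The traditional SPM layout described above can be illustrated with a minimal Python sketch. It covers only the baseline grid of windows, not the window-selection framework proposed in the paper. The function name `spm_histogram`, its parameters, the normalized coordinates, and the omission of per-level weights are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def spm_histogram(
    points: Sequence[Tuple[float, float]],
    codewords: Sequence[int],
    vocab_size: int,
    levels: int,
) -> List[int]:
    """Concatenate per-window codeword histograms over a spatial pyramid.

    Hypothetical helper: `points` holds (x, y) feature locations normalized
    to [0, 1], and `codewords` holds the codeword index assigned to each point.
    Per-level weighting is deliberately left out of this sketch.
    """
    feature: List[int] = []
    for level in range(levels):
        cells = 2 ** level  # level l has a 2^l x 2^l grid of windows
        hists = [[0] * vocab_size for _ in range(cells * cells)]
        for (x, y), word in zip(points, codewords):
            # Clamp so that a coordinate of exactly 1.0 falls in the last window.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1
        for hist in hists:
            feature.extend(hist)
    return feature


# Two features with different codewords in opposite corners of the image.
feature = spm_histogram([(0.1, 0.1), (0.9, 0.9)], [0, 1], vocab_size=2, levels=2)
print(len(feature))
```

With a vocabulary of K codewords and L levels, the representation has K × (1 + 4 + ... + 4^(L-1)) entries. This growth in dimensionality is the memory cost that selecting a subset of windows would reduce.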