The Visual Word Booster: A Spatial Layout of Words Descriptor Exploiting Contour Cues.

Xinghui Dong, Junyu Dong

doi:10.1109/tip.2018.2830127

Abstract

Although researchers have made efforts to use the spatial information of visual words to obtain better image representations, none of these studies takes contour cues into account. Meanwhile, the literature has shown that contour cues are important to the perception of imagery. Inspired by these findings, we propose to use the Spatial Layout of Words (SLoW) to boost visual word based image descriptors by exploiting contour cues. Essentially, the SLoW descriptor utilises contours and incorporates different types of commonly used visual words: hand-crafted basic contour elements (referred to as "contons"), textons, Scale-Invariant Feature Transform (SIFT) words, deep convolutional words and a special type of word, Local Binary Pattern (LBP) codes. Moreover, SLoW features are combined with Spatial Pyramid Matching (SPM) or Vector of Locally Aggregated Descriptors (VLAD) features. The SLoW descriptor and its combined versions are tested in different tasks. Our results show that they are superior to, or at least comparable to, their counterparts examined in this study. In particular, the joint use of the SLoW descriptor boosts the performance of the SPM and VLAD descriptors. We attribute these results to two facts: contour cues are important to human visual perception, and the SLoW descriptor captures not only local image characteristics but also the global spatial layout of these characteristics in a more perceptually consistent way than its counterparts.

Full Text