Abstract

This paper presents a novel scheme to address the polysemy of visual words in the widely used bag-of-words model. Because a visual word may carry multiple meanings, we show that semantic contexts can be used to disambiguate those meanings and thereby improve the performance of the bag-of-words model. On one hand, for each image, multiple context-specific bag-of-words histograms are constructed, one per semantic context. These histograms are then merged by retaining, for each visual word, only its most discriminative context, yielding a compact image representation. On the other hand, an image is also represented by the occurrence probabilities of the semantic contexts. Finally, when classifying an image, the two representations are combined at the decision level to exploit the complementary information they embed. Experiments on three challenging image databases (PASCAL VOC 2007, Scene-15 and MSRCv2) show that our method significantly outperforms state-of-the-art classification methods.
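The merging step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the histogram counts and the per-word discriminativeness scores are random stand-ins for quantities that would come from feature extraction and training data, respectively.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, n_contexts = 8, 3  # toy vocabulary size and number of semantic contexts

# Hypothetical context-specific bag-of-words histograms for one image:
# hist[c, w] counts occurrences of visual word w under semantic context c.
hist = rng.integers(0, 5, size=(n_contexts, n_words)).astype(float)

# Hypothetical discriminative power of each context for each visual word
# (in the paper this would be estimated from labeled training data).
score = rng.random(size=(n_contexts, n_words))

# Merge: for each visual word, keep only the bin from its most
# discriminative context, producing one compact histogram.
best_ctx = score.argmax(axis=0)               # shape: (n_words,)
merged = hist[best_ctx, np.arange(n_words)]   # shape: (n_words,)

# Second representation: occurrence probabilities of the semantic contexts.
ctx_prob = hist.sum(axis=1) / hist.sum()

print(merged.shape)      # compact context-selected histogram
print(ctx_prob.sum())    # probabilities sum to 1
```

At classification time, the two vectors (`merged` and `ctx_prob`) would feed separate classifiers whose scores are fused at the decision level.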
