Image Classification Model Using Visual Bag of Semantic Words

Yali Qi, Yeli Li, Guoshan Zhang

doi:10.1134/s1054661819030222

Abstract

In image classification, the visual bag of words (BoW) model has two drawbacks. The first is low classification accuracy: a visual BoW is typically built from local, low-level visual feature vectors extracted at key points, without considering the high-level semantics of the image. The second is excessive time consumption, because the vocabulary is very large, especially for images with explicit backgrounds and object content. To address both problems, we propose a novel image classification model based on a visual bag of semantic words (BoSW). The model includes an automatic segmentation algorithm based on graph cuts, which extracts the major semantic regions, and a semantic annotation algorithm based on a support vector machine, which labels those regions with a visual semantic vocabulary. The proposed BoSW model refines image semantics by introducing user conceptions for extracting semantic vocabularies and for reducing the size of the vocabulary. Experimental results on benchmark datasets demonstrate the superiority of the proposed algorithm over state-of-the-art methods.
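To make the idea of a bag of semantic words concrete, the following is a minimal sketch, not the authors' implementation. It assumes that segmentation and annotation have already produced one semantic label per region, and it encodes an image as a normalized histogram over a small vocabulary. The vocabulary entries, the function name `bosw_histogram`, and the choice of a normalized histogram are all illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter

# Hypothetical semantic vocabulary (an assumption for illustration only).
VOCABULARY = ["sky", "grass", "water", "building", "animal"]


def bosw_histogram(region_labels, vocabulary=VOCABULARY):
    """Encode one image as a normalized histogram of semantic words.

    region_labels: labels assigned to the image's segmented regions,
    assumed here to come from an upstream annotation step.
    """
    # Count only labels that belong to the vocabulary.
    counts = Counter(label for label in region_labels if label in vocabulary)
    total = sum(counts.values())
    if total == 0:
        # No recognized regions: return an all-zero vector.
        return [0.0] * len(vocabulary)
    return [counts[word] / total for word in vocabulary]


# Example: an image whose four regions were labeled as follows.
labels = ["sky", "grass", "grass", "animal"]
print(bosw_histogram(labels))  # [0.25, 0.5, 0.0, 0.0, 0.25]
```

In this sketch the feature vector has one entry per semantic word, so its length equals the vocabulary size. This illustrates why a compact semantic vocabulary can cost less than a large low-level visual vocabulary.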
