Creating visual vocabulary based on SIFT descriptor in compressed domain

Lei Sui, Yuncong Yang, Li Zhuo, Jing Zhang

doi:10.1109/wcsp.2011.6096718

Abstract

Recently, the bag-of-words (BoW) model, widely used in textual information processing, has been extended to many tasks in the visual domain, such as image classification, scene analysis, image annotation, and image retrieval, where it is known as the bag-of-visual-words (BoVW) model. Creating an effective visual vocabulary is therefore essential. Most existing approaches create visual vocabularies from images in the pixel domain, which requires extra processing time to decompress the images, since most images are stored in compressed format. In this paper, we propose to create a visual vocabulary based on the Scale Invariant Feature Transform (SIFT) descriptor in the compressed domain in three steps: (1) constructing low-resolution images in the compressed domain; (2) extracting SIFT descriptors from the low-resolution images; and (3) creating a visual vocabulary from the extracted SIFT descriptors. To evaluate the performance of the visual words, experiments were conducted on identifying pornographic images. Experimental results indicate that the proposed method can recognize pornographic images accurately with much reduced computational time.
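The three steps can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the low-resolution image is taken to be a DC image built from dequantized DC coefficients of an orthonormal 8x8 block DCT (so each DC value is 8 times the block mean, with JPEG's level shift ignored), the vocabulary is built with plain k-means, and SIFT extraction is left out, so descriptors are passed in as lists of numbers. The function names `dc_image`, `build_vocabulary`, and `bovw_histogram` are hypothetical.

```python
import random


def dc_image(dc_coeffs):
    """Step 1 (assumed): one low-resolution pixel per 8x8 block.

    For an orthonormal 8x8 DCT, the DC coefficient equals 8 * block mean,
    so dividing by 8 recovers the mean intensity without a full decode.
    """
    return [[dc / 8.0 for dc in row] for row in dc_coeffs]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centers):
    """Index of the center closest to vec."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))


def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Step 3 (assumed): k-means over descriptors; centers are visual words."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(d, centers)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bovw_histogram(descriptors, vocabulary):
    """Normalized count of descriptors assigned to each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]
```

In use, step 2 would run a SIFT extractor on the output of `dc_image`, pool the descriptors from a training set into `build_vocabulary`, and then represent each image by its `bovw_histogram` for a downstream classifier.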
