Abstract
Object classification is an important area of computer vision with many applications, including robotics, image search, face recognition, assistance for visually impaired people, and image censoring. A common feature-based method of classification is the Bag of Words (BoW) approach, in which a codebook of visual words is created using a clustering method. To increase performance, the Multiple Dictionaries BoW (MDBoW) method, which draws more visual words from different independent dictionaries instead of adding more words to the same dictionary, was implemented using a hard clustering method. Hard clustering assigns each feature to its nearest neighbor. A given feature may be nearly the same distance from two cluster centers, yet a typical hard clustering method selects only the slightly nearer center to represent that feature. Ambiguous features are therefore not well represented by the visual vocabulary. To address this problem, a soft-clustering-based Multiple Dictionary Bag of Visual Words model for image classification is implemented, with dictionaries generated using a modified Fuzzy C-means algorithm using the R1 norm. A performance evaluation on images has been carried out by varying the dictionary size. The proposed method works better when the number of topics and the number of images per topic are larger. The results indicate that the multiple dictionary bag of words model using fuzzy clustering improves recognition performance over the baseline method.
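The difference between hard and soft assignment of an ambiguous feature can be illustrated with a minimal sketch. It assumes the standard Fuzzy C-means membership formula with Euclidean distance and fuzzifier m = 2; it is not the modified R1-norm variant used in the paper, and the function names, centers, and feature values are hypothetical.

```python
import numpy as np

def hard_assign(features, centers):
    """Return the index of the nearest center for each feature (hard clustering)."""
    # Pairwise Euclidean distances, shape (n_features, n_centers)
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def fuzzy_memberships(features, centers, m=2.0, eps=1e-12):
    """Standard fuzzy C-means memberships (assumption: plain Euclidean distance).

    Each row sums to 1; u[i, j] is the degree to which feature i belongs to center j.
    """
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, eps)  # avoid division by zero when a feature sits on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Two hypothetical visual words and two features: one clear, one ambiguous.
centers = np.array([[0.0, 0.0], [2.0, 0.0]])
features = np.array([[0.1, 0.0], [0.99, 0.0]])

labels = hard_assign(features, centers)    # both features go to word 0
u = fuzzy_memberships(features, centers)   # second feature is split almost evenly
```

In this toy case the second feature lies almost midway between the two centers. Hard assignment gives it entirely to word 0, whereas the fuzzy memberships keep its near-equal affinity to both words.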
Highlights
One of the most important and challenging problems in machine vision is retrieving images from a large and highly varied image data set based on visual content
In the baseline method, more words are added to the same dictionary, whereas in Multiple Dictionaries for BoW (MDBoW) more words are taken from different independent dictionaries
Summary
One of the most important and challenging problems in machine vision is retrieving images from a large and highly varied image data set based on visual content. Automatic classification of images helps in the efficient search and management of these large image collections. A feature-based method of classification is the Bag of Words approach (Lazebnik et al., 2006). It addresses the recognition problem by starting from visual features rather than from segmentation. The first step in classifying images using Bag of Words is creating a codebook of visual words. To do this, features are extracted using detectors or dense sampling, and a descriptor is computed at each extracted local keypoint. Local descriptors such as the Haar descriptor (Viola and Jones, 2001), the Scale-Invariant Feature Transform (SIFT) descriptor (Lowe, 2004), the Histogram of Oriented Gradients (HOG) descriptor (Dalal and Triggs, 2005) and the Speeded Up Robust Features (SURF) descriptor (Bay et al., 2006) are commonly used.
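The codebook step, and the way several independent dictionaries can be combined, can be sketched as follows. This is an illustrative outline only: random vectors stand in for real local descriptors, a small k-means written in NumPy stands in for whatever clustering method is used, and building each dictionary from a different random subset of the training descriptors is an assumption rather than the paper's stated procedure. All function names are hypothetical.

```python
import numpy as np

def kmeans(data, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers (the visual words)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster becomes empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Normalized histogram of nearest-word counts for one image."""
    d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(centers))
    return counts / counts.sum()

def mdbow_histogram(descriptors, dictionaries):
    """Concatenate the per-dictionary histograms into one image signature."""
    return np.concatenate([bow_histogram(descriptors, c) for c in dictionaries])

rng = np.random.default_rng(42)
train = rng.normal(size=(600, 8))  # stand-in for pooled training descriptors

# Assumption: each dictionary is learned from a different random half of the data.
dictionaries = [
    kmeans(train[rng.choice(len(train), size=300, replace=False)], k=5, seed=s)
    for s in range(3)
]

image_desc = rng.normal(size=(50, 8))  # stand-in descriptors of one image
hist = mdbow_histogram(image_desc, dictionaries)
```

With three dictionaries of five words each, the image signature has 15 bins. A single-dictionary baseline of the same total size would instead cluster all descriptors into 15 words at once.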