Abstract

Recently, the bag-of-words approach has been successfully applied to automatic image annotation, object recognition, and related tasks. The method first quantizes an image into visual terms and then extracts image-level statistics for classification. Although successful applications have been reported, the approach cannot model either the spatial dependency among patches or the correspondence between patches and visual parts. Moreover, quantization deteriorates the descriptive power of the patch features. This paper proposes the hidden maximum entropy (HME) approach for modeling visual concepts. Each concept is composed of a set of visual parts, and each part follows a Gaussian distribution. The spatial dependency and the image-level statistics of the parts are modeled through maximum entropy. The model is learned with the proposed EM-IIS algorithm. We report preliminary results on 260 concepts from the Corel dataset and compare the HME approach with the maximum entropy (ME) approach. Our concept-detection experiments show that (1) the average AUC of the HME approach improves on that of the ME approach by a relative 10.3%, and (2) the HME approach reduces the average equal error rate from 0.412 for the ME approach to 0.354.
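To make the mechanism concrete, the following is a minimal sketch, not the authors' implementation, of the two ingredients the abstract describes: soft-assigning patch descriptors to Gaussian visual parts instead of hard quantization, and scoring a concept with a maximum-entropy (log-linear) model over the resulting image-level part statistics. All names, dimensions, and parameter values below are illustrative assumptions; the paper learns these quantities with its EM-IIS algorithm, which is not reproduced here.

```python
# Sketch of the HME idea from the abstract (illustrative, not the paper's code):
# soft Gaussian part assignment replaces hard vector quantization, and a
# max-ent (log-linear) detector scores image-level part statistics.
import numpy as np

rng = np.random.default_rng(0)
D, K = 8, 4                      # patch-feature dimension, number of visual parts

# Hypothetical Gaussian part parameters (diagonal covariance for simplicity).
means = rng.normal(size=(K, D))
variances = np.ones((K, D))
log_priors = np.log(np.full(K, 1.0 / K))

def part_responsibilities(patches):
    """Soft posterior p(part k | patch) under the Gaussian parts.

    This replaces hard quantization, so a patch's descriptive power is not
    collapsed onto a single codeword."""
    diff = patches[:, None, :] - means[None, :, :]                    # (N, K, D)
    log_lik = -0.5 * np.sum(diff**2 / variances
                            + np.log(2 * np.pi * variances), axis=2)  # (N, K)
    log_post = log_priors + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)                   # stabilize
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)                     # (N, K)

def image_statistics(patches):
    """Image-level feature vector: expected part-occurrence frequencies."""
    return part_responsibilities(patches).mean(axis=0)                # (K,)

def maxent_prob(stats, weights):
    """Binary max-ent concept detector: p(concept | image) = sigmoid(w . f)."""
    return 1.0 / (1.0 + np.exp(-stats @ weights))

# Toy usage: score two random "images" with an assumed weight vector.
weights = rng.normal(size=K)
for name in ("image_a", "image_b"):
    patches = rng.normal(size=(30, D))          # 30 patch descriptors per image
    score = maxent_prob(image_statistics(patches), weights)
    print(name, "p(concept) =", round(float(score), 3))
```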
