Abstract
To bridge the semantic gap that exists in image retrieval, this paper proposes an approach combining generative and discriminative learning to accomplish the task of automatic image annotation and retrieval. We first present continuous probabilistic latent semantic analysis (PLSA) to model continuous quantities. We then propose a hybrid framework that employs continuous PLSA to model the visual features of images in the generative learning stage and uses ensembles of classifier chains to classify the multi-label data in the discriminative learning stage. Because the framework combines the advantages of generative and discriminative learning, it can predict semantic annotations precisely for unseen images. Finally, we conduct a series of experiments on a standard Corel dataset. The experimental results show that our approach outperforms many state-of-the-art approaches.
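The discriminative stage named in the abstract uses ensembles of classifier chains (ECC): each chain trains one binary classifier per label, feeding earlier labels in the chain as extra features, and the ensemble majority-votes over random label orders. The sketch below is a minimal illustration of that idea, not the paper's implementation; the perceptron base learner, the function names, and the toy interface are all assumptions.

```python
import random

class Perceptron:
    """Tiny binary base learner; any per-label classifier could stand in here."""
    def __init__(self, dim, epochs=50, lr=0.1):
        self.w = [0.0] * (dim + 1)  # last weight is the bias
        self.epochs, self.lr = epochs, lr

    def predict(self, x):
        s = sum(wi * xi for wi, xi in zip(self.w, x)) + self.w[-1]
        return 1 if s > 0 else 0

    def fit(self, X, Y):
        for _ in range(self.epochs):
            for x, t in zip(X, Y):
                upd = self.lr * (t - self.predict(x))
                if upd:  # mistake-driven update
                    for i, xi in enumerate(x):
                        self.w[i] += upd * xi
                    self.w[-1] += upd
        return self

def train_chain(X, Y, order):
    """One chain: the classifier for label order[k] also sees the
    (true, at training time) values of the k earlier labels in the chain."""
    chain = []
    for k, j in enumerate(order):
        Xa = [x + [Y[i][p] for p in order[:k]] for i, x in enumerate(X)]
        chain.append(Perceptron(len(Xa[0])).fit(Xa, [row[j] for row in Y]))
    return chain

def predict_chain(chain, order, x):
    y, preds = [0] * len(order), []
    for clf, j in zip(chain, order):
        y[j] = clf.predict(x + preds)  # earlier *predicted* labels as features
        preds.append(y[j])
    return y

def ecc_predict(X, Y, x, n_chains=5, seed=0):
    """Ensemble of classifier chains: majority vote over random label orders."""
    rng = random.Random(seed)
    L = len(Y[0])
    votes = [0] * L
    for _ in range(n_chains):
        order = list(range(L))
        rng.shuffle(order)
        yhat = predict_chain(train_chain(X, Y, order), order, x)
        votes = [v + yj for v, yj in zip(votes, yhat)]
    return [1 if 2 * v > n_chains else 0 for v in votes]
```

Chaining lets later classifiers exploit label correlations, while the random-order ensemble averages out the sensitivity to any single chain order.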
Highlights
As an important research issue, content-based image retrieval (CBIR) searches for images relevant to a given example at the visual level
To bridge the semantic gap that exists in image retrieval, this paper proposes an approach combining generative and discriminative learning to accomplish the task of automatic image annotation and retrieval
The performance of the Hybrid Generative/Discriminative Model (HGDM) is compared with some state-of-the-art approaches—the Translation Model [12], cross-media relevance models (CMRM) [9], the continuous-space relevance model (CRM) [10], the multiple Bernoulli relevance model (MBRM) [11], probabilistic latent semantic analysis (PLSA)-WORDS [15], supervised multiclass labeling (SML) [7], TGLM [21] and multi-label sparse coding (MSC) [8]
Summary
As an important research issue, content-based image retrieval (CBIR) searches for images relevant to a given example at the visual level. Two perspectives on automatic annotation exist. The first is based on a discriminative model: it defines auto-annotation as a traditional supervised classification problem [4,5,6,7,8], treating each semantic concept as an independent class and creating a different classifier for each concept. The second perspective takes a different stand: it is based on a generative model and treats images and text as equivalent data, attempting to discover the correlation between visual features and textual words on an unsupervised basis by estimating their joint distribution. This paper presents continuous PLSA, which assumes that the feature vectors of images are governed by a Gaussian distribution under a given latent aspect, rather than a multinomial one.
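The modeling choice described above—each latent aspect z emitting feature vectors through a Gaussian p(x|z) while each document d keeps its own mixing weights P(z|d)—can be fitted with an EM procedure. The following is a minimal sketch under assumed interfaces (documents as lists of feature vectors, diagonal covariances, deterministic initialization), not the paper's implementation:

```python
import math

def gauss_logpdf(x, mu, var):
    """Log-density of a diagonal Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mu, var))

def continuous_plsa(docs, K, iters=20):
    """EM for continuous PLSA: shared Gaussians p(x|z) over K latent aspects,
    per-document mixing weights P(z|d). docs: list of lists of feature vectors."""
    D, dim = len(docs), len(docs[0][0])
    all_x = [x for d in docs for x in d]
    # Deterministic init: spread the aspect means over the pooled features.
    mu = [list(all_x[(z * len(all_x)) // K]) for z in range(K)]
    var = [[1.0] * dim for _ in range(K)]
    pz_d = [[1.0 / K] * K for _ in range(D)]
    for _ in range(iters):
        # E-step: responsibilities P(z | d, x) ∝ P(z|d) p(x | mu_z, var_z)
        resp = []
        for d, xs in enumerate(docs):
            for x in xs:
                logp = [math.log(pz_d[d][z] + 1e-12) + gauss_logpdf(x, mu[z], var[z])
                        for z in range(K)]
                m = max(logp)                      # log-sum-exp for stability
                w = [math.exp(l - m) for l in logp]
                s = sum(w)
                resp.append((d, x, [wi / s for wi in w]))
        # M-step: re-estimate P(z|d) and the aspect Gaussians.
        for d in range(D):
            rows = [r for dd, _, r in resp if dd == d]
            pz_d[d] = [sum(r[z] for r in rows) / len(rows) for z in range(K)]
        for z in range(K):
            nz = sum(r[z] for _, _, r in resp) + 1e-12
            mu[z] = [sum(r[z] * x[i] for _, x, r in resp) / nz for i in range(dim)]
            var[z] = [max(sum(r[z] * (x[i] - mu[z][i]) ** 2 for _, x, r in resp) / nz,
                          1e-6) for i in range(dim)]
    return pz_d, mu, var
```

Replacing the multinomial emission of standard PLSA with a Gaussian is what lets the model work directly on continuous visual features instead of requiring vector quantization.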