Abstract

Concept-based multimedia search has become increasingly popular in multimedia information retrieval (MIR). However, which semantic concepts should be used for data collection and model construction remains an open question. There is very little research on automatically choosing multimedia concepts with small semantic gaps. In this paper, we propose a novel framework to develop a lexicon of high-level concepts with small semantic gaps (LCSS) from a large-scale Web image dataset. By defining a confidence map and a content-context similarity matrix, images with small semantic gaps are selected and clustered. The final concept lexicon is mined from the surrounding descriptions (titles, categories and comments) of these images. This lexicon offers a set of high-level concepts with small semantic gaps, which helps researchers decide where to focus data collection, annotation and modeling. It also shows promising potential for image annotation refinement and rejection. The experimental results demonstrate the validity of the developed concept lexicon.
