Abstract

In this chapter, we present an approach to handling multi-modality in image retrieval using the Vector Space Model (VSM), which is widely used in text retrieval. We extend the model with visual terms, aiming to narrow the semantic gap by helping to map low-level visual features to high-level textual semantic concepts. Combining the textual and visual modalities in a single space also makes it possible to query a textual database with visual content, or a visual database with textual content. In addition, to improve text retrieval performance we propose a novel expansion and re-ranking method applied to both the documents and the query. When textual annotations of images are acquired automatically, they may contain too much information, and document expansion then adds further noise to the retrieval results; we therefore propose a re-ranking phase that discards such noisy terms. The approaches introduced in this chapter were evaluated in two sub-tasks of ImageCLEF2009. First, we tested the multi-modality component in ImageCLEFmed and obtained the best rank in mixed retrieval, which combines the textual and visual modalities. Second, we tested the expansion and re-ranking methods in ImageCLEFWiki, where our runs outperformed all others, taking the top four positions in text-only retrieval. The results show that handling multi-modality in text retrieval with a VSM is promising, and that document expansion and re-ranking play an important role in text-based image retrieval.
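As a minimal illustration of the core idea (a sketch, not the chapter's actual implementation), the code below builds a TF-IDF vector space whose vocabulary mixes textual annotation terms with visual terms, so that a mixed-modality query can be scored against documents with ordinary cosine similarity. The `vis_*` identifiers, the example documents, and the helper names are all hypothetical stand-ins for quantized visual words and indexed image annotations.

```python
import math
from collections import Counter

def build_index(docs):
    # Document frequency for every term in the shared textual+visual vocabulary.
    df = Counter(term for doc in docs for term in set(doc))
    return df, len(docs)

def vectorize(terms, df, n):
    # TF-IDF weights; terms unseen at indexing time are dropped.
    tf = Counter(terms)
    return {t: (1 + math.log(f)) * math.log(n / df[t])
            for t, f in tf.items() if t in df}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Each "document" mixes textual annotation terms with visual terms
# (the vis_* identifiers stand for hypothetical quantized visual words).
docs = [
    ["chest", "xray", "pneumonia", "vis_17", "vis_42"],
    ["brain", "mri", "tumor", "vis_03", "vis_42"],
    ["chest", "ct", "nodule", "vis_17", "vis_99"],
]
df, n = build_index(docs)
doc_vecs = [vectorize(d, df, n) for d in docs]

# A mixed query: a textual term plus a visual term from a query image.
query_vec = vectorize(["chest", "vis_17"], df, n)
ranking = sorted(range(n), key=lambda i: cosine(query_vec, doc_vecs[i]),
                 reverse=True)
print(ranking)  # documents 0 and 2 rank above document 1
```

Because textual and visual terms share one vocabulary, the same machinery ranks documents for a text-only, image-only, or mixed query; this is the property the chapter exploits for cross-modal retrieval.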
