Traditional content-based image retrieval (CBIR) systems often fail to meet a user’s needs because of the ‘semantic gap’ between the features the system extracts and the user’s query. The semantic gap arises from the failure to extract real semantics from both the image and the query. Extracting the semantics of images, however, is a difficult task. Most existing techniques define a set of semantic categories in advance and assign images to the appropriate categories through a learning process. Nevertheless, these techniques always need human intervention and rely on content-based features. In this paper we propose a novel approach to bridging the semantic gap, which is the major deficiency of CBIR systems. We overcome this deficiency by extracting the semantics of an image from the environmental texts around it. Since an image generally co-exists with accompanying texts in various formats, we can rely on such environmental texts to discover the semantics of the image. We apply a text mining process, which uses the self-organizing map (SOM) learning algorithm as its kernel, to the environmental texts of an image to extract semantic information about that image. The text mining process can also uncover implicit semantic information about the images. We also define a semantic relevance measure to support semantic-based image retrieval. We performed experiments on a set of images collected from web pages and obtained promising results.
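
As an illustration of the kind of SOM-based text mining described above, the following is a minimal sketch, not the method used in the paper. It assumes that each image's environmental text is encoded as a binary bag-of-words vector and that a small rectangular map is trained with a Gaussian neighborhood and linearly decaying learning rate. The function names (`bag_of_words`, `train_som`, `best_matching_unit`) and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def bag_of_words(texts):
    """Encode each text as a binary term vector over a shared vocabulary.

    Assumption: whitespace tokenization and binary weights; the paper's
    actual text encoding is not specified in the abstract.
    """
    token_sets = [set(text.lower().split()) for text in texts]
    vocab = sorted(set().union(*token_sets))
    vectors = [[1.0 if word in tokens else 0.0 for word in vocab]
               for tokens in token_sets]
    return vocab, vectors


def best_matching_unit(weights, vector):
    """Return the index of the map unit whose weight vector is closest."""
    def squared_distance(unit_weights):
        return sum((w - v) ** 2 for w, v in zip(unit_weights, vector))
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i]))


def train_som(vectors, rows=3, cols=3, epochs=50, seed=0):
    """Train a rows x cols self-organizing map on the given vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per map unit, stored row-major, initialized randomly.
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(epochs):
        progress = epoch / epochs
        learning_rate = 0.5 * (1.0 - progress)  # decays linearly toward 0
        radius = max(rows, cols) / 2.0 * (1.0 - progress) + 0.5
        for vector in vectors:
            bmu = best_matching_unit(weights, vector)
            bmu_row, bmu_col = divmod(bmu, cols)
            for unit, unit_weights in enumerate(weights):
                unit_row, unit_col = divmod(unit, cols)
                grid_dist_sq = (unit_row - bmu_row) ** 2 + (unit_col - bmu_col) ** 2
                # Gaussian neighborhood: units near the winner move more.
                influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
                for k in range(dim):
                    unit_weights[k] += learning_rate * influence * (vector[k] - unit_weights[k])
    return weights


if __name__ == "__main__":
    # Hypothetical environmental texts, one per image.
    texts = [
        "red sunset over the sea",
        "sunset on the beach near the sea",
        "stock market price chart",
    ]
    vocab, vectors = bag_of_words(texts)
    som = train_som(vectors, rows=2, cols=2, epochs=30, seed=1)
    for text, vector in zip(texts, vectors):
        print(best_matching_unit(som, vector), text)
```

Under these assumptions, images whose environmental texts map to the same or neighboring units could be treated as semantically related. How the map is actually turned into a semantic relevance measure is defined in the paper itself and is not reproduced here.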