Abstract

As a consequence of the semantic gap, visual similarity does not guarantee semantic similarity, which in general conflicts with the inherent assumption of many generative-based image annotation methods. While the discriminative learning approach has often been used to classify images into different semantic classes, its efficiency is often impaired by the problems of multi-labeling and the large-scale concept space typically encountered in practical image annotation tasks. In this paper, we explore solutions to the problems of large-scale concept space learning and the mismatch between the semantic and visual spaces. To tackle the first problem, we explore the use of a higher-level, lower-dimensional semantic space obtained by clustering correlated keywords into topics in the local neighborhood. The topics are used as the lexicon for assigning multiple labels to unlabeled images. To tackle the semantic gap, we aim to reduce the bias between the visual and semantic spaces by finding optimal margins in both spaces. In particular, we propose an iterative solution that alternately maximizes the sum of the margins to reduce the gap between visual similarity and semantic similarity. Experimental results on the ECCV2002 benchmark show that our method outperforms the state-of-the-art generative-based annotation method MBRM and the discriminative-based ASVM-MIL by 9% and 11%, respectively, in terms of F1 measure.
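To illustrate the topic-clustering idea only, the following is a minimal sketch, not the authors' exact algorithm: correlated keywords are grouped into a smaller set of topics by clustering their co-occurrence vectors with k-means. The keyword list, the toy co-occurrence counts, and the number of topics are all assumed purely for the example.

    # Minimal sketch (assumed setup, not the paper's method): cluster keyword
    # co-occurrence vectors into a lower-dimensional topic space.
    import numpy as np
    from sklearn.cluster import KMeans

    keywords = ["sky", "clouds", "sun", "water", "boat", "sea", "grass", "tree"]

    # Toy keyword-by-image co-occurrence matrix (rows: keywords, columns: images).
    # In practice this would be built from the training annotations.
    cooccurrence = np.array([
        [1, 1, 0, 1, 0, 1],   # sky
        [1, 1, 0, 0, 0, 1],   # clouds
        [1, 0, 0, 1, 0, 0],   # sun
        [0, 1, 1, 0, 1, 1],   # water
        [0, 0, 1, 0, 1, 0],   # boat
        [0, 1, 1, 0, 1, 1],   # sea
        [0, 0, 0, 1, 0, 1],   # grass
        [0, 0, 0, 1, 0, 1],   # tree
    ], dtype=float)

    n_topics = 3  # assumed value; the paper derives topics from the local neighborhood
    labels = KMeans(n_clusters=n_topics, n_init=10, random_state=0).fit_predict(cooccurrence)

    # Each resulting topic (a group of correlated keywords) can then serve as a
    # label in the reduced semantic space.
    for t in range(n_topics):
        print(f"topic {t}:", [kw for kw, c in zip(keywords, labels) if c == t])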
