Abstract

Because automatically mapping visual features to semantic descriptors is difficult, state-of-the-art frameworks have exhibited poor coverage and effectiveness when indexing visual content. This prompted us to investigate two things: the Web as a large source of relevant contextual linguistic information, and bimodal visual-textual indexing as a technique for enriching the vocabulary of index concepts. Our proposal is based on the Signal/Semantic approach to multimedia indexing, which generates multifaceted conceptual representations of visual content. We propose to enrich these image representations with concepts automatically extracted from the contextual information of the visual content. We specifically target semantic concepts that are more specific than the initial index concepts, since they represent the visual content with greater accuracy and precision. We also aim to correct faulty indexes produced by automatic semantic tagging. We give the details of the prototype implementation and test the presented technique in a Web-scale evaluation on 30 queries representing elaborate image scenes.
