Abstract

In this paper, a novel framework for rich-media object retrieval is described. The searchable items are media representations consisting of multiple modalities, such as 2-D images, 3-D objects and audio files, which share a common semantic concept. The proposed method utilizes the low-level descriptors of each separate modality to construct a new low-dimensional feature space, where all media objects can be mapped irrespective of their constituent modalities. While most existing state-of-the-art approaches support queries of only a single modality at a time, the proposed one allows querying with multiple modalities simultaneously, through efficient multimodal query formulation, and retrieves multimodal results of any available type. Finally, a multimedia indexing scheme is adopted to tackle the problem of large-scale media retrieval. The present framework offers significant advances over existing methods and can be easily extended to involve as many heterogeneous modalities as required. Experiments performed on two multimodal datasets demonstrate the effectiveness of the proposed method in multimodal search and retrieval.
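The retrieval pipeline the abstract outlines can be sketched as follows. This is a minimal illustration, not the paper's actual method: the per-modality projections into the common space are assumed to be learned offline (random matrices stand in for them here), the fusion of a multimodal query is assumed to be a simple average of its modality embeddings, and the descriptor dimensionalities are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality projections mapping raw low-level descriptors
# (image: 64-d, 3-D object: 128-d, audio: 32-d) into a shared 16-d space.
# In the described framework these mappings would be learned; random
# matrices are placeholders.
DIM = 16
projections = {
    "image": rng.standard_normal((64, DIM)),
    "object3d": rng.standard_normal((128, DIM)),
    "audio": rng.standard_normal((32, DIM)),
}

def embed(modality, descriptor):
    """Project one modality's descriptor into the common space, L2-normalised."""
    v = descriptor @ projections[modality]
    return v / np.linalg.norm(v)

def embed_object(parts):
    """Fuse a multimodal object or query by averaging its modality embeddings."""
    v = np.mean([embed(m, d) for m, d in parts], axis=0)
    return v / np.linalg.norm(v)

def retrieve(query_parts, index, k=3):
    """Return indices of the k most similar objects by cosine similarity."""
    q = embed_object(query_parts)
    sims = index @ q          # rows of `index` are unit vectors
    return np.argsort(-sims)[:k]

# Build a toy index of 10 multimodal media objects (image + audio each).
objects = [
    [("image", rng.standard_normal(64)), ("audio", rng.standard_normal(32))]
    for _ in range(10)
]
index = np.stack([embed_object(o) for o in objects])

# Query simultaneously with an image and an audio descriptor.
hits = retrieve(objects[3], index)
print(hits)
```

Because every item, regardless of its constituent modalities, lives in the same space, a single-modality query (e.g. only an image descriptor) and a multimodal one go through the same `retrieve` call; a large-scale deployment would replace the brute-force dot product with an approximate nearest-neighbour index, as the abstract's indexing scheme suggests.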
