Abstract

Due to the rapid development of mobile Internet technologies, cloud computing, and the popularity of online social networking and location-based services, a massive amount of multimedia data with geographical information is generated and uploaded to the Internet. In this paper, we propose a novel type of cross-modal multimedia retrieval called geo-multimedia cross-modal retrieval, which aims to retrieve a set of geo-multimedia objects based on geographical proximity and semantic similarity between different modalities. Previous studies on cross-modal retrieval and spatial keyword search cannot address this problem effectively because they neither consider multimedia data with geo-tags nor focus on this type of query. To address this problem efficiently, we present the definition of the $k$NN geo-multimedia cross-modal query for the first time and introduce related concepts such as the cross-modal semantic representation space. To bridge the semantic gap between different modalities, we propose a method named cross-modal semantic matching, which contains two important components, CorrProj and LogsTran, and aims to construct a common semantic representation space for cross-modal semantic similarity measurement. We also design a framework based on deep learning techniques to implement the construction of this common semantic representation space. In addition, a novel hybrid indexing structure named GMR-Tree, which combines geo-multimedia data and the R-Tree, is presented, and an efficient $k$NN search algorithm called $k$GMCMS is designed. Comprehensive experimental evaluation on real and synthetic datasets clearly demonstrates that our solution outperforms state-of-the-art methods.
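The abstract names CorrProj and LogsTran without defining them here. The following is a minimal sketch of how a two-branch mapping into a common semantic representation space could look, assuming CorrProj is a learned linear projection per modality and LogsTran a logistic (sigmoid) transformation; the dimensions (4096, 300, 128) and the NumPy implementation are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def logs_tran(x):
    """Assumed LogsTran: a logistic (sigmoid) transformation squashing
    projected features into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

class CorrProj:
    """Assumed CorrProj: one learned linear projection per modality into a
    shared d-dimensional semantic space (weights shown untrained)."""
    def __init__(self, in_dim, shared_dim):
        self.W = rng.normal(scale=0.01, size=(in_dim, shared_dim))
        self.b = np.zeros(shared_dim)

    def __call__(self, features):
        return logs_tran(features @ self.W + self.b)

# Hypothetical dimensions: 4096-d image CNN features and 300-d text features,
# both mapped into the same 128-d common semantic representation space.
image_proj = CorrProj(in_dim=4096, shared_dim=128)
text_proj = CorrProj(in_dim=300, shared_dim=128)

img_vec = image_proj(rng.normal(size=4096))  # image branch -> shared space
txt_vec = text_proj(rng.normal(size=300))    # text branch  -> shared space
```

Once both modalities live in the same space, cross-modal semantic similarity reduces to an ordinary single-space vector comparison, which is what the retrieval stage measures.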

Highlights

  • Due to the rapid popularity of mobile Internet techniques, online social networking, and location-based services, a massive amount of multimedia data is generated and uploaded to the Internet

  • To solve the problem of geo-multimedia cross-modal retrieval, we introduce a novel framework that consists of multi-modal feature extraction, cross-modal semantic space mapping, a geo-multimedia spatial index, and cross-modal semantic similarity measurement (a sketch of such a combined ranking function follows this list)

  • We provide an overview of previous work on multi-modal and cross-modal retrieval, deep-learning-based multimedia retrieval, and spatial textual search, all of which are related to this work
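The query ranks objects by both geographical proximity and cross-modal semantic similarity. As a concrete (assumed) instance of such a measure, the sketch below blends normalized spatial proximity with cosine similarity in the common semantic space; the weight `alpha` and the distance normalizer `d_max` are hypothetical parameters, not values from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two vectors in the common semantic space."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def combined_score(q_loc, q_vec, obj_loc, obj_vec, alpha=0.5, d_max=1.0):
    """Blend spatial proximity and cross-modal semantic similarity.
    alpha (blend weight) and d_max (distance normalizer) are illustrative
    knobs, not values taken from the paper."""
    dist = math.hypot(q_loc[0] - obj_loc[0], q_loc[1] - obj_loc[1])
    proximity = 1.0 - min(dist / d_max, 1.0)      # in [0, 1]
    semantic = cosine_similarity(q_vec, obj_vec)  # in [-1, 1]
    return alpha * proximity + (1.0 - alpha) * semantic
```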

Summary

INTRODUCTION

Due to the rapid popularity of mobile Internet techniques, online social networking, and location-based services, a massive amount of multimedia data is generated and uploaded to the Internet. Previous studies on traditional multi-modal and cross-modal retrieval do not consider geo-multimedia data, so these existing methods cannot improve retrieval performance by exploiting spatial information. We present a novel framework for geo-multimedia cross-modal retrieval based on deep learning and spatial indexing techniques. To improve search performance, we present a novel hybrid indexing structure named GMR-Tree, which combines signature techniques, multi-modal semantic representations, and the R-Tree. Based on it, we develop a novel search algorithm named kGMCMS to boost retrieval.
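The paper's kGMCMS operates on the GMR-Tree; the sketch below shows only the general best-first kNN pattern such an algorithm could follow: expand index nodes in decreasing order of an optimistic upper bound on the combined score, and stop once the k-th confirmed result beats every unexpanded subtree. The node layout (`.mbr`, `.entries`, `.is_leaf`, leaf objects with `.loc` and `.vec`), the bound derivation (semantic similarity capped at 1), and the reuse of the earlier `combined_score` sketch are all assumptions; the GMR-Tree's signature-based semantic pruning is omitted here.

```python
import heapq, itertools, math

_tie = itertools.count()  # tie-breaker so the heap never compares node objects

def min_dist(mbr, q):
    """Lower bound on the distance from query point q to a node's minimum
    bounding rectangle, given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = mbr
    dx = max(x1 - q[0], 0.0, q[0] - x2)
    dy = max(y1 - q[1], 0.0, q[1] - y2)
    return math.hypot(dx, dy)

def knn_search(root, q_loc, q_vec, k, score, alpha=0.5, d_max=1.0):
    """Best-first kNN over an R-Tree-style index. `score` is a combined
    spatial/semantic ranking function (e.g. the combined_score sketched
    above) using the same alpha and d_max, so the node bound stays valid."""
    def upper_bound(node):
        # Optimistic score of any object under this node: best achievable
        # proximity from its MBR, plus semantic similarity capped at 1.
        proximity = 1.0 - min(min_dist(node.mbr, q_loc) / d_max, 1.0)
        return alpha * proximity + (1.0 - alpha) * 1.0

    heap = [(-upper_bound(root), next(_tie), root)]
    top_k = []  # min-heap keeping the k best (score, tie, object) so far
    while heap:
        neg_ub, _, node = heapq.heappop(heap)
        if len(top_k) == k and -neg_ub <= top_k[0][0]:
            break  # no unexpanded subtree can beat the current k-th score
        if node.is_leaf:
            for obj in node.entries:  # leaf objects carry .loc and .vec
                s = score(q_loc, q_vec, obj.loc, obj.vec)
                if len(top_k) < k:
                    heapq.heappush(top_k, (s, next(_tie), obj))
                elif s > top_k[0][0]:
                    heapq.heapreplace(top_k, (s, next(_tie), obj))
        else:
            for child in node.entries:
                heapq.heappush(heap, (-upper_bound(child), next(_tie), child))
    return [obj for _, _, obj in sorted(top_k, key=lambda t: -t[0])]
```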

RELATED WORK
CROSS-MODAL SEMANTIC REPRESENTATION SPACE
THE FRAMEWORK
CROSS-MODAL SEMANTIC MATCHING
CROSS-MODAL SEMANTIC REPRESENTATION SPACE LEARNING
kNN GEO-MULTIMEDIA CROSS-MODAL SEARCH ALGORITHM
CONCLUSION