Abstract
With the explosive growth of multimedia data on the Internet, cross-modal retrieval has attracted a great deal of attention in both the computer vision and multimedia communities. However, this task is challenging due to the heterogeneity gap between different modalities. Current approaches typically involve a common representation learning process that maps data from different modalities into a common space via linear or nonlinear embedding. Yet most of them handle only the dual-modal situation and generalize poorly to complex cases involving multiple modalities. In addition, they often require expensive fine-grained alignment of training data across diverse modalities. In this paper, we address these issues with a novel cross-modal memory network (CMMN), in which memory contents across modalities are learned simultaneously and end to end, without the need for exact alignment. We further account for the diversity across multiple modalities using an adversarial learning strategy. Extensive experimental results on several large-scale datasets demonstrate that the proposed CMMN approach achieves state-of-the-art performance on the task of cross-modal retrieval.
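Although the abstract gives no implementation details, the common-representation idea it describes can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the paper's method: two modalities (here, hypothetical 512-d image features and 300-d text features) are projected by linear embeddings into a shared space, where retrieval reduces to cosine-similarity ranking. The dimensions, weight initialization, and function names are all illustrative choices; in practice the embeddings would be learned, e.g. with an adversarial modality discriminator as the abstract suggests.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 512-d image features,
# 300-d text features, a 128-d common space.
D_IMG, D_TXT, D_COMMON = 512, 300, 128

# Linear embeddings into the common space (random here; learned in practice).
W_img = rng.normal(scale=0.02, size=(D_IMG, D_COMMON))
W_txt = rng.normal(scale=0.02, size=(D_TXT, D_COMMON))

def embed(x, W):
    """Project a batch of modality-specific features into the common space,
    L2-normalizing each row so cosine similarity is a dot product."""
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def retrieve(query_z, gallery_z):
    """Rank gallery items by descending cosine similarity to the query."""
    return np.argsort(-(gallery_z @ query_z))

# Toy batch: 4 images queried by the first of 4 captions.
z_img = embed(rng.normal(size=(4, D_IMG)), W_img)
z_txt = embed(rng.normal(size=(4, D_TXT)), W_txt)
ranking = retrieve(z_txt[0], z_img)  # indices of images, best match first
```

Because both modalities land in the same normalized space, any query modality can be matched against any gallery modality with the same similarity function, which is what makes the common-space formulation attractive for more than two modalities.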