Dual Compositional Learning in Interactive Image Retrieval

Jongseok Kim, Youngjae Yu, Gunhee Kim, Hoeseong Kim

doi:10.1609/aaai.v35i2.16271

Abstract

We present an approach named Dual Composition Network (DCNet) for interactive image retrieval that searches for the best target image given a natural language query and a reference image. To accomplish this task, existing methods have focused on learning a composite representation of the reference image and the text query that is as close as possible to the embedding of the target image. We refer to this approach as the Composition Network. In this work, we propose to close the loop with a Correction Network that models the difference between the reference and target images in the embedding space and matches it with the embedding of the text query. That is, we consider two cyclic directional mappings for triplets of (reference image, text query, target image) by using both the Composition Network and the Correction Network. We also propose a joint training loss that can further improve the robustness of multimodal representation learning. We evaluate the proposed model on three benchmark datasets for multimodal retrieval: Fashion-IQ, Shoes, and Fashion200K. Our experiments show that DCNet achieves new state-of-the-art performance on all three datasets, and that adding the Correction Network consistently improves multiple existing methods that are based solely on a Composition Network. Moreover, an ensemble of our model won first place in the Fashion-IQ 2020 challenge held in a CVPR 2020 workshop.
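The two directional mappings described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the architectures or the form of the joint loss, so everything below is an assumption made for illustration. The function names (`compose`, `correct`, `matching_loss`, `joint_loss`, `retrieve`) are hypothetical, fixed vector arithmetic stands in for the learned networks, and an in-batch softmax cross-entropy stands in for the matching objective.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def compose(ref, text):
    """Stand-in for a Composition Network: (reference, text) -> predicted target embedding.
    Assumption: plain vector addition replaces the learned composition."""
    return normalize([r + t for r, t in zip(ref, text)])

def correct(ref, target):
    """Stand-in for a Correction Network: (reference, target) -> predicted text embedding.
    Assumption: the raw difference (target - reference) replaces the learned mapping."""
    return normalize([t - r for r, t in zip(ref, target)])

def matching_loss(preds, golds):
    """In-batch softmax cross-entropy: prediction i should score highest against gold i.
    Assumption: the abstract does not say which matching loss is used."""
    golds = [normalize(g) for g in golds]
    total = 0.0
    for i, p in enumerate(preds):
        scores = [dot(p, g) for g in golds]
        log_z = math.log(sum(math.exp(s) for s in scores))
        total += log_z - scores[i]
    return total / len(preds)

def joint_loss(refs, texts, targets):
    """Sum of the two directional losses over a batch of triplets (an assumed weighting)."""
    comp = matching_loss([compose(r, q) for r, q in zip(refs, texts)], targets)
    corr = matching_loss([correct(r, t) for r, t in zip(refs, targets)], texts)
    return comp + corr

def retrieve(ref, text, gallery):
    """Rank gallery embeddings by cosine similarity to the composed query, best first."""
    query = compose(ref, text)
    scores = [dot(query, normalize(g)) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -scores[i])

# Toy triplets in which each target is exactly reference + text.
refs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
texts = [[0.0, 2.0, 0.0], [0.0, 0.0, 2.0], [2.0, 0.0, 0.0]]
targets = [[r + q for r, q in zip(ref, txt)] for ref, txt in zip(refs, texts)]

loss = joint_loss(refs, texts, targets)
ranking = retrieve(refs[1], texts[1], targets)
```

In this sketch the composition direction is the one used at retrieval time, while the correction direction only contributes a second training signal over the same triplets. That split is one plausible reading of "closing the loop", not a detail confirmed by the abstract.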


Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 25

