Abstract

Cross-modal image retrieval methods enable users to find desired images from a text query via a shared embedding space. However, most existing methods do not account for semantically similar texts and images in the embedding space. In this paper, we propose a novel cross-modal image retrieval method that considers the relationships between semantically similar texts and images. Our method constructs an embedding space consistent with semantic similarity by using the object information in images. Experimental results verify that, compared to existing methods, our method is effective at keeping semantically similar texts and images close in the embedding space.
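
To make the retrieval setting concrete, the following is a minimal sketch of nearest-neighbor retrieval in a shared embedding space, not the paper's proposed method: the embedding dimensionality and the random placeholder embeddings (standing in for trained text and image encoders) are assumptions for illustration only.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Normalize along the last axis so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve(text_emb: np.ndarray, image_embs: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k gallery images closest to the text query embedding."""
    sims = l2_normalize(image_embs) @ l2_normalize(text_emb)
    return np.argsort(-sims)[:k]

# Placeholder embeddings; in practice these would come from trained
# text and image encoders mapping into the shared space (hypothetical setup).
rng = np.random.default_rng(0)
text_emb = rng.standard_normal(512)             # embedding of the text query
image_embs = rng.standard_normal((1000, 512))   # embeddings of 1,000 gallery images

print(retrieve(text_emb, image_embs, k=5))
```

In this framing, a method that keeps semantically similar texts and images close in the embedding space directly improves the ranking produced by the cosine-similarity search above.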
