Abstract

Owing to its low storage cost and fast query speed, cross-modal hashing has attracted increasing attention recently. However, most existing cross-modal hashing methods either apply the same measurement metric to data of different modalities or fail to capture the heterogeneous correlation between modalities, which leads to information loss and leaves the heterogeneous correlation unresolved. In this paper, we propose Deep semantic Cross Modal Hashing based on Graph similarity of Modal-Specific (DCMHGMS), a method that not only considers inter-modal similarity but also designs two modal-specific graphs to characterize intra-modal similarity. First, we measure the inter-modal similarity between image and text with a weighted combination of Euclidean distance and cosine distance, which addresses the heterogeneous-correlation problem. Second, we build the intra-modal similarity of the image graph with the Euclidean distance function, and that of the text graph with the cosine distance function. Attending to the specifics of each modality improves retrieval accuracy and thereby mitigates the information loss. Moreover, the model incorporates semantic information embedding, a quantization loss, and a bit-balance constraint. Experimental results on two datasets demonstrate the effectiveness of the proposed DCMHGMS.
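To make the similarity construction concrete, below is a minimal NumPy sketch, not the authors' implementation: the Gaussian mapping from Euclidean distance to similarity, the weight alpha, the function names, and the toy features standing in for the image- and text-network outputs are all our own assumptions.

    import numpy as np

    def cosine_sim(A, B):
        # Pairwise cosine similarity between the rows of A and B.
        An = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
        Bn = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-12)
        return An @ Bn.T

    def euclidean_sim(A, B):
        # Pairwise Euclidean distances between rows, mapped to (0, 1]
        # similarities with a Gaussian kernel (bandwidth: mean distance;
        # this mapping is an assumption, not taken from the paper).
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
        return np.exp(-(d ** 2) / (2 * d.mean() ** 2 + 1e-12))

    def inter_modal_sim(f_img, f_txt, alpha=0.5):
        # Weighted combination of Euclidean- and cosine-based similarity
        # between image and text features (alpha is a hypothetical weight).
        return alpha * euclidean_sim(f_img, f_txt) \
            + (1 - alpha) * cosine_sim(f_img, f_txt)

    # Toy features; both networks are assumed to emit same-dimensional outputs.
    rng = np.random.default_rng(0)
    f_img = rng.normal(size=(6, 32))  # image-network features
    f_txt = rng.normal(size=(6, 32))  # text-network features

    S_inter = inter_modal_sim(f_img, f_txt)  # weighted inter-modal similarity
    S_img = euclidean_sim(f_img, f_img)      # image graph: Euclidean-based
    S_txt = cosine_sim(f_txt, f_txt)         # text graph: cosine-based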

Highlights

  • With the rapid development of computer networks and the explosive growth of multi-modal data such as image, text, video, and audio, single-modal retrieval can no longer meet users' needs; attention has therefore shifted to retrieval across modalities, including multimodal retrieval [2], [3] and cross-modal retrieval

  • To address the above challenges, we propose a method termed Deep semantic Cross Modal Hashing based on Graph similarity of Modal-Specific (DCMHGMS)

  • Our proposed DCMHGMS achieves the best performance on both datasets, outperforming Deep Cross-Modal Hashing (DCMH); this is because we exploit inter-modal similarity, intra-modal similarity, and semantic embedding simultaneously, which effectively improves retrieval accuracy

Summary

INTRODUCTION

With the rapid development of computer networks and the explosive growth of multi-modal data such as image, text, video, and audio, single-modal retrieval can no longer meet users' needs; attention has therefore shifted to retrieval across modalities, including multimodal retrieval [2], [3] and cross-modal retrieval. Cross-modal hashing methods [4]–[17], [19], [22]–[24], [36], [37], [41] project high-dimensional instances into a common Hamming space, where heterogeneous data obtain a unified representation and can be measured uniformly. Deep Cross-Modal Hashing (DCMH) [23] proposed an end-to-end framework, but it adopts the same measurement metric for the image and text modalities and does not take intra-modal similarity into account. The above cross-modal hashing methods either apply the same measurement metric to data of different modalities or rely on non-deep models; the former overlooks the specifics of heterogeneous data and causes information loss, while the latter cannot explore the heterogeneous correlation across modalities well. To address these challenges, we propose a method termed Deep semantic Cross Modal Hashing based on Graph similarity of Modal-Specific (DCMHGMS).
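As an illustration of retrieval in a common Hamming space, here is a minimal sketch assuming 32-bit {-1, +1} codes; the helper names and toy data are hypothetical and not tied to any specific method cited above.

    import numpy as np

    def binarize(F):
        # Quantize real-valued network outputs to {-1, +1} hash codes;
        # a quantization loss in such models penalizes the gap between
        # F and sign(F).
        return np.where(F >= 0, 1, -1)

    def hamming_rank(query_code, db_codes):
        # For +/-1 codes of length c, Hamming distance = (c - <q, b>) / 2,
        # so ranking by inner product equals ranking by Hamming distance.
        scores = db_codes @ query_code
        return np.argsort(-scores)

    rng = np.random.default_rng(0)
    txt_query = binarize(rng.normal(size=32))      # code from a text network
    img_db = binarize(rng.normal(size=(100, 32)))  # codes from an image network
    print(hamming_rank(txt_query, img_db)[:10])    # top-10 retrieved images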

RELATED WORK
DEEP SEMANTIC CROSS MODAL HASHING BASED ON GRAPH SIMILARITY OF MODAL-SPECIFIC
OPTIMIZATION
OUT-OF-SAMPLE
BASELINE METHODS
Methods
Findings
CONCLUSION