Abstract

Cross-modal hashing is a key technology for real-time retrieval of large-scale multimedia data in real-world applications. Although existing cross-modal hashing methods have achieved impressive results, some limitations remain: (1) some methods do not fully consider the rich semantic information and the noise in labels, resulting in a large semantic gap, and (2) some methods adopt relaxation-based or discrete cyclic coordinate descent algorithms to handle the discrete constraints, resulting in large quantization error or high time consumption. To address these limitations, in this paper we propose a novel method, named Discrete Semantics-Guided Asymmetric Hashing (DSAH). Specifically, DSAH leverages both label information and the similarity matrix to enhance the semantic information of the learned hash codes, and uses the ℓ2,1 norm to increase matrix sparsity, mitigating the inevitable noise and subjective factors in labels. Meanwhile, an asymmetric hash learning scheme is proposed to perform hash learning efficiently. In addition, a discrete optimization algorithm is proposed to solve for the hash codes directly, discretely, and quickly. During optimization, hash code learning and hash function learning interact: the learned hash codes guide the learning of the hash function, and the hash function simultaneously guides hash code generation. Extensive experiments on two benchmark datasets highlight the superiority of DSAH over several state-of-the-art methods.
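The abstract's use of the ℓ2,1 norm to induce sparsity can be illustrated with a minimal sketch. The function name `l21_norm` and the example matrix are illustrative assumptions, not the paper's code; the paper's actual objective applies this norm inside a larger optimization.

```python
import numpy as np

def l21_norm(M):
    """ℓ2,1 norm: the sum of the ℓ2 norms of the rows of M.
    Minimizing it as a regularizer drives whole rows toward zero,
    which yields row-wise sparsity and robustness to noisy labels."""
    return float(np.sum(np.linalg.norm(M, axis=1)))

# Illustrative matrix: row norms are 5, 0, and 1, so the ℓ2,1 norm is 6.
M = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l21_norm(M))  # 6.0
```

Unlike the entrywise ℓ1 norm, the ℓ2,1 norm couples the entries within each row, so a row is either suppressed as a whole (e.g., a noisy label vector) or kept largely intact.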

Highlights

  • In recent years, due to the rapid development of multimedia Internet of Things technologies, there has been an explosive growth in the amount of multimedia network data. The current unimodal search methods can no longer meet the multimedia data retrieval requirements in the complex environment of the new information era. Cross-modal retrieval methods [1,2,3] have therefore received increasing attention from the information retrieval community and have become a hot research topic in both academia and industry

  • Our proposed Discrete Semantics-Guided Asymmetric Hashing (DSAH) leverages both the similarity matrix and label information to enhance the semantic information of the learned hash codes, and addresses the problem of noise contained in the labels

  • On the MIRFlickr dataset, compared to the best baseline, i.e., Subspace Relation Learning for Cross-modal Hashing (SRLCH), the mean average precision (mAP) scores of DSAH increase by 2.7% on average, and on the NUS-WIDE dataset, DSAH obtains the highest mAP scores among all compared baselines, which demonstrates the efficacy of DSAH
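
The mAP metric cited above can be computed as follows; this is a minimal sketch of the standard definition, not the paper's evaluation code, and the function name and sample relevance lists are illustrative assumptions.

```python
import numpy as np

def mean_average_precision(relevance_lists):
    """relevance_lists: one binary relevance vector per query, ordered by
    decreasing retrieval score (e.g., increasing Hamming distance).
    Returns the mean of the per-query average precision (AP) values."""
    aps = []
    for rel in relevance_lists:
        rel = np.asarray(rel, dtype=float)
        if rel.sum() == 0:          # query with no relevant items
            aps.append(0.0)
            continue
        cum_hits = np.cumsum(rel)   # relevant items retrieved up to rank k
        precision_at_k = cum_hits / (np.arange(len(rel)) + 1)
        aps.append(float(np.sum(precision_at_k * rel) / rel.sum()))
    return float(np.mean(aps))

# One query, relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6.
print(mean_average_precision([[1, 0, 1, 0]]))
```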


Summary

Introduction

Due to the rapid development of multimedia Internet of Things technologies, the amount of multimedia network data has grown explosively. Some cross-modal hashing methods are based on symmetric learning strategies, which yield worse retrieval performance than asymmetric ones. Our proposed DSAH handles the nonlinear relations within different modalities with a kernelization technique, and an asymmetric learning scheme is proposed to effectively perform the hash function learning and hash code learning processes. We leverage both label information and the similarity matrix to enhance the semantic information of the learned hash codes. In summary, a novel supervised cross-modal hashing method, DSAH, is proposed to learn discriminative compact hash codes for large-scale retrieval tasks. DSAH takes the label information and similarity matrix into consideration, which improves the discriminative capability of the learned hash codes, and addresses the problems of matrix sparseness and outlier handling.
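The kernelization step mentioned above, which captures nonlinear relations before hash learning, is commonly realized with an RBF anchor-based feature map. The sketch below assumes that approach; the function name, anchor count, and bandwidth are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def rbf_kernel_features(X, anchors, sigma=1.0):
    """Map raw features X (n x d) to nonlinear kernel features (n x m)
    against m anchor points: phi(x)_j = exp(-||x - a_j||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))                          # 100 samples, 8-dim features
anchors = X[rng.choice(100, size=16, replace=False)]   # 16 anchors sampled from X
Phi = rbf_kernel_features(X, anchors)
print(Phi.shape)  # (100, 16)
```

Hash functions are then learned on `Phi` rather than on the raw features, so linear projections in the kernel space can model nonlinear structure in the original space.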

Related Works
Unsupervised Hashing
Supervised Hashing
Deep Hashing
The Proposed DSAH Framework
Kernelization
Feature Mapping
Label Alignment Scheme
Asymmetric Learning Framework
The Joint Framework
Optimization
Out-of-Sample Extension
Complexity Analysis
MIRFlickr
NUS-WIDE
Methodology
Implementation Details
Results
Method
Effects of Discrete Optimization
Effects of Kernelization
Effects of Word Embeddings
Effects of Deep Learning Based Representation
Effects of Parameters
Convergence Analysis
Limitations
Conclusions
