Abstract

Recent years have witnessed the growing popularity of hashing for large-scale multimedia retrieval. Numerous hashing methods have been designed for data stored on a single machine, that is, centralized hashing. In many real-world applications, however, large-scale data are often distributed across different locations, servers, or sites. Although hashing for distributed data can in theory be implemented by assembling all of the distributed data into a single dataset, doing so usually incurs prohibitive computation, communication, and storage costs in practice. To date, only a few methods have been tailored for distributed hashing, and all of them are unsupervised. In this paper, we propose an efficient and effective method called supervised distributed hashing (SupDisH), which learns discriminative hash functions by leveraging semantic label information in a distributed manner. Specifically, we cast the distributed hashing problem into the framework of classification, where the learned binary codes are expected to be distinct enough for semantic retrieval. By introducing auxiliary variables, the distributed model is then separated into a set of decentralized subproblems with consistency constraints, which can be solved in parallel on each vertex of the distributed network. As such, we obtain high-quality, distinctive, unbiased binary codes and consistent hash functions with low computational complexity, which facilitates tackling large-scale multimedia retrieval tasks over distributed datasets. Experimental evaluations on three large-scale datasets show that SupDisH is competitive with centralized hashing methods and significantly outperforms the state-of-the-art unsupervised distributed hashing method.
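To make the decomposition described in the abstract concrete, below is a minimal sketch of a consensus-style splitting for learning hash functions over distributed data. It illustrates the general idea only and is not the paper's SupDisH algorithm: it assumes linear hash projections, replaces the classification loss with a least-squares surrogate over label-derived binary targets, and uses consensus-ADMM updates; all identifiers (`local_update`, `consensus_hashing`, `rho`) are hypothetical.

```python
import numpy as np

def local_update(X, Y, Z, U, rho):
    """Closed-form solution of one node's local subproblem.

    Minimizes ||X W - Y||^2 + (rho/2) ||W - Z + U||^2 over W,
    a least-squares surrogate for a classification-based hashing loss
    (Y holds label-derived targets in {-1, +1}).
    """
    d = X.shape[1]
    A = X.T @ X + (rho / 2) * np.eye(d)
    b = X.T @ Y + (rho / 2) * (Z - U)
    return np.linalg.solve(A, b)

def consensus_hashing(nodes, n_bits, rho=1.0, n_iters=50):
    """Consensus-ADMM sketch: each node k learns a local projection W_k
    constrained to agree with a shared variable Z (the consistent hash
    function). Only W_k and the dual U_k are exchanged, never raw data."""
    d = nodes[0][0].shape[1]
    Z = np.zeros((d, n_bits))
    Us = [np.zeros((d, n_bits)) for _ in nodes]
    for _ in range(n_iters):
        # Local subproblems, solvable in parallel on each vertex.
        Ws = [local_update(X, Y, Z, U, rho) for (X, Y), U in zip(nodes, Us)]
        # Consensus step: average the local solutions.
        Z = np.mean([W + U for W, U in zip(Ws, Us)], axis=0)
        # Dual update enforcing the consistency constraints W_k = Z.
        Us = [U + W - Z for W, U in zip(Ws, Us)]
    return Z

def hash_codes(X, Z):
    """Binarize the projections to obtain hash codes in {-1, +1}."""
    return np.sign(X @ Z)

# Toy usage on synthetic per-node data (4 nodes, 16-dim features, 8 bits).
rng = np.random.default_rng(0)
nodes = [(rng.normal(size=(100, 16)), np.sign(rng.normal(size=(100, 8))))
         for _ in range(4)]
Z = consensus_hashing(nodes, n_bits=8)
codes = hash_codes(nodes[0][0], Z)
```

The point of the splitting, in this sketch as in the abstract, is that each subproblem touches only its node's local data; the nodes exchange just the small projection and dual matrices, which is what keeps communication and storage costs far below assembling the full dataset centrally.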
