Prototype-Based Sample Selection for Active Hashing

Cheong Hee Park

doi:10.3844/jcssp.2015.839.844

Abstract

Several hashing-based methods for Approximate Nearest Neighbors (ANN) search in large data sets have been proposed recently. In particular, semi-supervised hashing utilizes semantic similarity given for a small fraction of pairwise data samples, and active hashing aims to improve ANN search performance by relying on an expert to label the most informative points. In this study, we present an active hashing method based on prototype-based sample selection. Knowing the semantic similarities between cluster prototypes can help extract relations among the points in the corresponding clusters. For expert labeling, we select prototypes from clusters that do not contain any labeled data points, so that all areas can be covered effectively. Experimental results demonstrate that the proposed active hashing method improves the performance of ANN search.
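The selection rule stated in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the use of k-means, the farthest-first initialization, the choice of the cluster member closest to the centroid as the prototype, and the names `kmeans` and `select_prototypes` are all hypothetical choices made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-first start (an assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        assign = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    assign = [nearest(p, centers) for p in points]
    return centers, assign


def select_prototypes(points, labeled_idx, k):
    """Return one prototype index per cluster that has no labeled point."""
    centers, assign = kmeans(points, k)
    covered = {assign[i] for i in labeled_idx}  # clusters already labeled
    selected = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if c in covered or not members:
            continue
        # Prototype: the actual data point closest to the cluster centroid.
        selected.append(min(members, key=lambda i: sq_dist(points[i], centers[c])))
    return selected
```

The indices returned by `select_prototypes` would then be passed to the expert for labeling; clusters that already contain a labeled point are skipped, which is the coverage idea the abstract describes.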

Highlights

As very large data collections become easier to obtain, efficient methods for nearest neighbor search are needed in various areas such as data mining and pattern recognition (Shakhnarovich et al., 2006).
Unlike the data-independent hashing of Locality Sensitive Hashing (LSH), several data-dependent hashing methods, including Spectral Hashing (SH) (Weiss et al., 2008) and Binary Reconstructive Embedding (BRE) (Kulis et al., 2009), learn hash functions from training data so that similar data points in the original space are mapped to nearby points in the binary embedding space.
We present an active hashing method based on prototype-based sample selection.
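For contrast with the data-dependent methods mentioned above, a data-independent hash can be sketched with random hyperplanes. This is only an illustration: the source names LSH but does not specify a hash family, so the sign-of-random-projection scheme used here is an assumption, and the function names are hypothetical.

```python
import random


def random_hyperplanes(dim, k, seed=0):
    """Draw k random Gaussian directions; they do not depend on the data."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def hash_code(x, planes):
    """Map a point to a k-bit code: one bit per hyperplane side."""
    code = 0
    for w in planes:
        bit = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
        code = (code << 1) | bit
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def ann_search(query, codes, planes, top=3):
    """Rank database items by Hamming distance to the query's code."""
    q = hash_code(query, planes)
    return sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))[:top]
```

A data-dependent method would replace `random_hyperplanes` with projections learned from training data, while the Hamming-distance ranking step would stay the same.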

Summary

Introduction

As very large data collections become easier to obtain, efficient methods for nearest neighbor search are needed in various areas such as data mining and pattern recognition (Shakhnarovich et al., 2006). Hashing-based methods map data points to k-bit binary codes and search for nearest neighbors in the resulting binary embedding space. Semi-supervised hashing utilizes semantic similarity, which is given for a fraction of pairwise data samples in terms of two categories of relations: must-link and cannot-link (Wang et al., 2012; Mu et al., 2010). We present an active hashing method based on prototype-based sample selection. Assuming that clusters and their prototypes have been found, it is well known that prototypes can be used to find nearest neighbors efficiently (Tan et al., 2014).
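The must-link and cannot-link relations can be illustrated with a small sketch that turns expert labels into pairwise constraints. The encoding (+1 for must-link, -1 for cannot-link, unlisted pairs unknown), the derivation of relations from class labels, and the name `pairwise_constraints` are assumptions made for the example, not details taken from the source.

```python
from itertools import combinations


def pairwise_constraints(labels):
    """Build pairwise relations from expert labels.

    labels: dict mapping a sample index to the class label the expert gave.
    Returns a dict keyed by (i, j) with i < j:
      +1 for a must-link pair (same label),
      -1 for a cannot-link pair (different labels).
    Pairs that are absent from the dict have no known relation.
    """
    relations = {}
    for i, j in combinations(sorted(labels), 2):
        relations[(i, j)] = 1 if labels[i] == labels[j] else -1
    return relations
```

Only the labeled samples appear in the result, which reflects the setting described above: semantic similarity is available for just a fraction of the pairwise data samples.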

Related Works

Experimental Results for Active Hashing

Conclusion

Journal: Journal of Computer Science
Publication Date: Jul 1, 2015
Citations: 5
License type: cc-by
