Abstract
Approximate nearest neighbors (ANN) search for large scale data has attracted considerable attention due to the fact that large amounts of data are easily available. Recently, hashing has been widely adopted for similarity search because of its good potential for low storage cost and fast query speed. Among of them, when semantic similarity information is available, supervised hashing methods show better performance than unsupervised ones. However, supervised hashing methods need explicit similarity information which is not available in some scenarios. In addition, they have the problems of difficult optimization and time consuming for training, which make them unpracticable to large scale data. In this paper, we propose an unsupervised hashing method – Unsupervised Euclidean Hashing (USEH), which learns and generates hashing codes to preserve the Euclidean distance relationship between data. Specifically, USEH first utilizes Locality-Sensitive Hashing (LSH) to generate pseudo labels; then, it adopts a sequential learning strategy to learn the hash functions, one bit at a time, which can generate very discriminative codes. Moreover, USEH avoids explicitly computing the similarity matrix by decomposing it into the product of a label matrix and its transposition, which makes the training complexity of USEH linear to the size of training samples when the number of training samples is much greater than the dimension of feature. Thus, it can efficiently work on large scale data. We test USEH on two large scale datasets – SIFT1M and GIST1M. Experimental results show that USEH is comparable to state-of-the-art unsupervised hashing methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.