Abstract

When a database holds a large amount of high-dimensional data, existing exact nearest neighbor retrieval methods cannot return ideal results within an acceptable retrieval time. Researchers have therefore turned to approximate nearest neighbor retrieval. Hashing-based approximate nearest neighbor retrieval has recently attracted increasing attention because of its small storage footprint and high retrieval efficiency, and the development of neural networks has further advanced hash learning. However, most of these methods are supervised. In practical applications, annotating large amounts of data is time-consuming and laborious, and efficiently using large amounts of unlabeled data for hash learning remains challenging. In this paper, we design a new autoencoder variant that efficiently captures the features of high-dimensional data, and we propose an unsupervised deep hashing method for large-scale data retrieval named Autoencoder-based Unsupervised Clustering and Hashing (AUCH). By constructing a hashing layer as a hidden layer of the autoencoder, AUCH performs hash learning together with unsupervised clustering by minimizing the overall loss. It thereby unifies unsupervised clustering and retrieval in a single learning model, and uses a deep neural network to simultaneously learn feature representations, hashing functions, and cluster assignments. Experimental results on standard datasets indicate that AUCH achieves competitive results compared with state-of-the-art models on retrieval and clustering tasks.
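To make the retrieval setting concrete, the following is a minimal sketch of how binary codes taken from a hashing layer could be used for approximate nearest neighbor search by Hamming distance. It is not the AUCH model: the trained encoder is replaced by a random projection (a stand-in chosen purely for illustration), and the names `binarize`, `hamming_rank`, and `projection`, the zero threshold, and all sizes are assumptions that do not come from the paper.

```python
import numpy as np


def binarize(activations):
    """Threshold real-valued hash-layer activations at 0 to obtain 0/1 codes.

    The zero threshold is an illustrative assumption, not taken from the paper.
    """
    return (np.asarray(activations) > 0).astype(np.uint8)


def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code.

    Returns (order, dists): indices sorted from nearest to farthest,
    and the Hamming distance of every database item to the query.
    """
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(dists, kind="stable")
    return order, dists


rng = np.random.default_rng(0)
n_items, dim, n_bits = 1000, 64, 16

# Synthetic high-dimensional data standing in for real feature vectors.
data = rng.standard_normal((n_items, dim))

# Hypothetical stand-in for a trained encoder ending in a hashing layer:
# a fixed random projection. A learned model would replace this step.
projection = rng.standard_normal((dim, n_bits))

db_codes = binarize(data @ projection)      # shape (n_items, n_bits)
query_code = db_codes[42]                   # reuse a database item as the query
order, dists = hamming_rank(query_code, db_codes)
top5 = order[:5]                            # indices of the 5 nearest codes
```

Each item is stored as `n_bits` binary values instead of `dim` floating-point values, and ranking needs only bitwise comparisons, which is the storage and efficiency advantage the abstract attributes to hashing-based retrieval.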
