Interpretable and effective hashing via Bernoulli variational auto-encoders

Francisco Mena,Ricardo Ñanculef,Carlos Valle

doi:10.3233/ida-200013

Abstract

Due to the rapid increase in the amount of data generated in many fields of science and engineering, information retrieval methods tailored to large-scale datasets have become increasingly important in the last years. Semantic hashing is an emerging technique for this purpose that works on the idea of representing complex data objects, like images and text, using similarity-preserving binary codes that are then used for indexing and search. In this paper, we investigate a hashing algorithm that uses a deep variational auto-encoder to learn and predict the codes. Unlike previous approaches of this type, that learn a continuous (Gaussian) representation and then project the embedding to obtain hash codes, our method employs Bernoulli latent variables in both the training and the prediction stage. Constraining the model to use a binary encoding allow us to obtain a more interpretable representation for hashing: each factor in the generative model represents a bit that should help to reconstruct and thus identify the input pattern. Interestingly, we found that the binary constraint does not lead to a loss but an increase of search accuracy. We argue that continuous formulations learn a representation that can significantly differ from the code used for search. Minding this gap in the design of the auto-encoder can translate into more accurate retrieval results. Extensive experiments on seven datasets involving image data and text data illustrate these findings and demonstrate the advantages of our approach.

Full Text