Abstract
Significant progress has recently been made in graph-based hashing methods for learning hash codes that preserve semantic similarity. However, many of these approaches are formulated as supervised learning problems that require labels, and large-scale labeled datasets are expensive to obtain, especially when the data are multimodal, which limits the applicability of such algorithms. In this study, a novel multi-view graph cross-modal hashing (MGCH) framework is proposed to generate hash codes in a semi-supervised manner from the outputs of multi-view graphs processed by a graph-reasoning module. In contrast to conventional graph-based hashing methods, MGCH adopts multi-view graphs as the only learning assistance that connects labeled and unlabeled data while pursuing binary embeddings. Multi-view graphs, which filter the features of multidirectional data over multiple anchor sets, are beneficial for feature refinement. As the core component of MGCH, an intuitive graph-reasoning network consisting of two graph convolutional layers and one graph attention layer is employed to simultaneously convolve anchor graphs and asymmetric graphs with the input data. Comprehensive cross-modal hashing evaluations on the Wiki, MIRFlickr-25K, NUS-WIDE, and MSCOCO datasets demonstrate that MGCH outperforms state-of-the-art methods when labeled data are limited.
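The abstract describes the graph-reasoning network as two graph convolutional layers followed by one graph attention layer applied to graphs built over anchors. The sketch below is a minimal, hypothetical illustration of such a module in PyTorch; the layer widths, the attention scoring, the use of a dense row-normalized adjacency matrix, and the final tanh relaxation to hash codes are assumptions for illustration, not the authors' exact design.

```python
# Hypothetical sketch of a graph-reasoning module: two graph-convolution layers
# followed by one graph-attention layer over a dense, row-normalized adjacency
# matrix (e.g., an anchor graph). All layer sizes and design details are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphReasoning(nn.Module):
    def __init__(self, in_dim, hid_dim, code_len):
        super().__init__()
        self.gc1 = nn.Linear(in_dim, hid_dim)      # first graph-convolution layer
        self.gc2 = nn.Linear(hid_dim, hid_dim)     # second graph-convolution layer
        self.attn = nn.Linear(2 * hid_dim, 1)      # pairwise scoring for the attention layer
        self.proj = nn.Linear(hid_dim, code_len)   # projection to the hash-code length

    def forward(self, x, adj):
        # x:   (n, in_dim) node features from one modality
        # adj: (n, n) row-normalized graph connecting samples (and/or anchors)
        h = F.relu(self.gc1(adj @ x))              # graph convolution 1
        h = F.relu(self.gc2(adj @ h))              # graph convolution 2

        # Simple pairwise attention restricted to neighbors defined by adj.
        n = h.size(0)
        hi = h.unsqueeze(1).expand(n, n, -1)
        hj = h.unsqueeze(0).expand(n, n, -1)
        scores = self.attn(torch.cat([hi, hj], dim=-1)).squeeze(-1)
        scores = scores.masked_fill(adj == 0, -1e9)
        alpha = torch.softmax(scores, dim=-1)
        h = alpha @ h                              # graph-attention layer

        return torch.tanh(self.proj(h))            # relaxed binary codes in (-1, 1)


# Usage (shapes only): codes = GraphReasoning(512, 256, 64)(features, anchor_graph)
```

Binarizing the output (e.g., taking the sign of the tanh activations) would yield the discrete hash codes; how MGCH combines the anchor and asymmetric graphs across views is not specified here.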