Abstract
Cross-media hashing, which conducts cross-media retrieval by embedding data from different modalities into a common low-dimensional Hamming space, has attracted intensive attention in recent years. This is motivated by the facts that a) multi-modal data are widespread, e.g., web images on Flickr are associated with tags, and b) hashing is an effective technique for large-scale, high-dimensional data processing, which is exactly the setting of cross-media retrieval. Inspired by recent advances in deep learning, we propose a cross-media hashing approach based on multi-modal neural networks. By requiring in the learning objective that a) the hash codes of relevant cross-media data be similar, and b) the hash codes be discriminative for predicting the class labels, the learned Hamming space is expected to capture cross-media semantic relationships well and to be semantically discriminative. Experiments on two real-world data sets show that our approach achieves superior cross-media retrieval performance compared with state-of-the-art methods.
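To make the two objective terms concrete, the following is a minimal sketch of how such a multi-modal hashing objective could be set up. It is not the authors' actual architecture: the network sizes, the module and function names (ModalityHashNet, joint_loss), the use of a mean-squared-error similarity term, and the weighting parameter alpha are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityHashNet(nn.Module):
    """Maps one modality's features to relaxed (continuous) hash codes in [-1, 1]."""
    def __init__(self, input_dim, code_len, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512), nn.ReLU(),
            nn.Linear(512, code_len), nn.Tanh(),  # tanh relaxes binary codes
        )
        # classifier on top of the codes enforces semantic discriminability
        self.classifier = nn.Linear(code_len, num_classes)

    def forward(self, x):
        code = self.encoder(x)
        logits = self.classifier(code)
        return code, logits

def joint_loss(img_code, txt_code, img_logits, txt_logits, labels, alpha=1.0):
    # a) codes of relevant (paired) cross-media data should be similar
    similarity_loss = F.mse_loss(img_code, txt_code)
    # b) codes should be discriminative for predicting the class labels
    class_loss = F.cross_entropy(img_logits, labels) + F.cross_entropy(txt_logits, labels)
    return similarity_loss + alpha * class_loss

# At retrieval time, binary hash codes would be obtained by thresholding,
# e.g. torch.sign(code), and compared by Hamming distance.
```

In this sketch, one network per modality (e.g., image features and tag features) shares the same code length, so the similarity term pulls paired items together in the common space while the classification term keeps the codes label-discriminative, mirroring the two constraints stated in the abstract.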