Abstract

Cross-media retrieval has attracted considerable attention and has become an increasingly worthwhile research direction in information retrieval. Unlike many related works that perform retrieval by mapping heterogeneous data into a common representation subspace with a pair of projection matrices, we feed multi-modal media data into a neural network model that employs a deep sparse neural network pre-trained with restricted Boltzmann machines and outputs their semantic understanding for semantic matching (RSNN-SM). Consequently, the heterogeneous modality data are represented by their top-level semantic outputs, and cross-media retrieval is performed by measuring their semantic similarities. Experimental results on several real-world datasets show that RSNN-SM achieves the best performance and outperforms state-of-the-art approaches.
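The retrieval step the abstract describes can be illustrated with a minimal sketch. Assuming each modality-specific network has already produced a top-level semantic vector per item (the `query` and `gallery` arrays below are hypothetical stand-ins for those outputs, not the paper's actual model), cross-media retrieval reduces to ranking items of the other modality by semantic similarity:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two semantic vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve(query_semantic, gallery_semantics, top_k=5):
    # Rank gallery items of the other modality by similarity of their
    # top-level semantic outputs to the query's semantic output.
    scores = [cosine_similarity(query_semantic, g) for g in gallery_semantics]
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), scores[int(i)]) for i in order]

# Hypothetical stand-ins: in the paper, these vectors would come from the
# RBM-pretrained deep sparse networks, one per modality.
rng = np.random.default_rng(0)
query = rng.random(10)            # semantic output of, e.g., an image query
gallery = rng.random((100, 10))   # semantic outputs of, e.g., 100 text items
print(retrieve(query, gallery))
```

Cosine similarity is used here only as a plausible choice of semantic similarity measure; the abstract does not specify which measure the authors adopt.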
