Abstract

Cross-media retrieval is becoming a new trend in information retrieval and has received great attention from both academia and industry. In this paper, we propose an effective retrieval method, dubbed Cross-media Retrieval with Collective Deep Semantic Learning (CR-CDSL), to address this problem. Two complementary deep neural networks are first learned to collectively project image and text samples into a joint semantic representation. Based on this representation, weak semantic labels are generated for unlabeled images and texts. These weakly labeled samples are then combined with the pre-labeled training samples to retrain the retrieval model, which discovers a discriminative shared semantic space for cross-media retrieval. Specifically, Deep Restricted Boltzmann Machines (DRBMs) are employed to initialize the weights of the two deep neural networks. The weak labels produced by collective deep semantic learning enhance the discriminative capability of the retrieval model and thus improve its retrieval performance. Experiments on several publicly available cross-media datasets demonstrate the superior performance of the proposed approach compared with several state-of-the-art techniques.
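To make the pipeline concrete, the sketch below illustrates the two-branch idea in PyTorch: one network per modality projects features into a shared semantic (class-probability) space, and their fused predictions on unlabeled pairs yield weak labels. The layer sizes, the averaging fusion rule, and the confidence threshold are illustrative assumptions rather than the authors' exact configuration, and DRBM weight initialization is only indicated by a comment.

```python
# Minimal sketch of collective deep semantic learning (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticNet(nn.Module):
    """One modality-specific branch mapping a feature vector to logits
    over the shared semantic categories. In CR-CDSL these weights would
    be initialized by DRBM pre-training (omitted here)."""
    def __init__(self, in_dim: int, n_classes: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def generate_weak_labels(img_net, txt_net, img_feats, txt_feats, thresh=0.8):
    """Fuse the two branches' softmax outputs on unlabeled image-text pairs
    and keep only confident predictions as weak labels. Averaging and
    thresholding are assumed heuristics for illustration."""
    img_net.eval()
    txt_net.eval()
    with torch.no_grad():
        p = 0.5 * (F.softmax(img_net(img_feats), dim=1)
                   + F.softmax(txt_net(txt_feats), dim=1))
    conf, labels = p.max(dim=1)
    keep = conf >= thresh            # boolean mask of confident samples
    return labels[keep], keep
```

The confidently weak-labeled samples selected by the mask would then be appended to the pre-labeled training set for retraining, after which nearest-neighbor search in the shared semantic space supports image-to-text and text-to-image retrieval.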

