Abstract

Driven by progress in deep neural networks (DNNs), DNNs have been applied to cross-media retrieval. Existing DNN-based cross-media retrieval methods convert the separate representation of each media type into a common representation under inter-media and intra-media constraints. With the common representation, similarities between heterogeneous instances can be measured and cross-media retrieval can be performed. However, common representation learning is hard to optimize because the inter-media and intra-media constraints must be satisfied together, which makes it a multi-objective optimization problem. This paper proposes the residual correlation network (RCN) to address this issue. RCN optimizes common representation learning with a residual function, which can fit the optimal mapping from separate representation to common representation and relieve the multi-objective optimization problem. Experiments show that the proposed approach achieves the best accuracy compared with 10 state-of-the-art methods on 3 datasets.
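To make the retrieval step concrete, the following is a minimal sketch of ranking text candidates against an image query once both live in a common representation space. The abstract does not state which similarity measure is used; cosine similarity is an assumption here, and the vectors, the ids, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return candidate ids sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), cid) for cid, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored]


# Hypothetical common representations (made-up values for illustration only).
image_query = [0.9, 0.1, 0.0]
text_candidates = {
    "text_a": [0.8, 0.2, 0.1],
    "text_b": [0.0, 0.1, 0.9],
    "text_c": [0.5, 0.5, 0.0],
}

ranking = rank_by_similarity(image_query, text_candidates)
print(ranking)
```

Because the image and the texts share one space, the same function serves image-to-text and text-to-image retrieval; only the roles of query and candidates swap.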

Highlights

  • Multimedia retrieval has become an indispensable part of contemporary Internet development

  • Common representation learning is challenging to optimize because inter-media and intra-media constraints must both be treated as objective functions [13, 14], which limits the performance of cross-media retrieval

  • The residual correlation network (RCN) has two subnetworks: a separate representation learning subnetwork and a residual correlation learning subnetwork

Summary

Introduction

Multimedia retrieval has become an indispensable part of contemporary Internet development. Common representation learning is challenging to optimize because inter-media and intra-media constraints must both be treated as objective functions [13, 14], which limits the performance of cross-media retrieval. He et al. [15] propose deep residual learning and argue that it is easier to optimize the residual mapping than the original mapping. A residual mapping could fit the discrepancy between separate representations and common representations and thereby further optimize common representation learning. Inspired by this paradigm, we propose the residual correlation network (RCN) to address the above cross-media optimization problem. The paper is concluded in the “Conclusions” section.
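The residual idea can be illustrated with a small sketch: instead of learning the full mapping from a separate representation x to a common representation, a branch F learns only the discrepancy, and the output is x + F(x). This is not the RCN architecture itself, which the summary does not specify. The single tanh layer, the equal input and output dimensions, and the names `linear` and `residual_map` are assumptions made for illustration.

```python
import math


def linear(x, weights, bias):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def residual_map(x, weights, bias):
    """Return x + F(x), where F(x) = tanh(W x + b) is the residual branch.

    Assumes the separate and common representations have the same dimension
    so that the skip connection can be a plain element-wise addition.
    """
    residual = [math.tanh(z) for z in linear(x, weights, bias)]
    return [xi + ri for xi, ri in zip(x, residual)]


# Hypothetical separate representation of one media instance.
separate = [0.5, -1.0, 2.0]

# With an all-zero residual branch, the mapping reduces to the identity:
# the network only has to learn the discrepancy, not the whole mapping.
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3
identity_out = residual_map(separate, zero_w, zero_b)

# A small non-zero bias shifts the common representation slightly away from x.
shifted_out = residual_map(separate, zero_w, [0.1, 0.0, 0.0])
print(identity_out, shifted_out)
```

The zero-weight case shows why the residual form is considered easier to optimize: when the separate representation is already close to a good common representation, the branch only needs to stay near zero.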

Methods
Experiments
Method
Conclusions
