Multi-model Network for Fine-Grained Cross-Media Retrieval

Jiemi Bai, Yazhou Yao, Fumin Shen, Wankou Yang, Qiong Wang, Yichao Zhou

doi:10.1007/978-3-030-60639-8_16

Abstract

With the development of the Internet, the forms of web data are rapidly increasing. However, existing cross-media retrieval methods mainly focus on coarse-grained retrieval, which falls far short of the needs of practical applications. In addition, the heterogeneity gap among different types of media tends to result in inconsistent data representations, so measuring similarity across media is quite challenging. In this work, we propose a novel multi-modal network for fine-grained cross-media retrieval. Specifically, our model consists of two kinds of networks: proprietary networks and a common network. Each proprietary network is designed as a dedicated feature extraction network for a single media type, extracting that media type's unique features to obtain a precise feature representation. The common network is designed to extract the common features of four different types of media. Comprehensive experiments demonstrate the effectiveness of our proposed approach. The source code and models of this work have been made publicly available at: https://github.com/fgcmr/fgcmr.
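
The two-branch design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the media type names, the feature dimensions, the random linear layers standing in for trained networks, the concatenation of the two branches, and the cosine similarity are all assumptions made purely for illustration, and inputs are assumed to be already embedded as fixed-length vectors.

```python
import math
import random

# Assumption: the four media types are named like this; the abstract does not list them.
MEDIA_TYPES = ("image", "text", "video", "audio")
D_IN, D_OUT = 8, 4  # hypothetical input and output feature sizes


def make_linear(rng, d_in, d_out):
    """Random weight matrix standing in for a trained network (illustration only)."""
    return [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]


def apply_linear(weights, x):
    """Linear projection followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


class TwoBranchEncoder:
    """One proprietary branch per media type plus one branch shared by all of them."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.proprietary = {m: make_linear(rng, D_IN, D_OUT) for m in MEDIA_TYPES}
        self.common = make_linear(rng, D_IN, D_OUT)

    def encode(self, media, x):
        unique = apply_linear(self.proprietary[media], x)  # media-specific features
        shared = apply_linear(self.common, x)              # features from the shared branch
        return unique + shared  # assumption: the two branches are simply concatenated


def cosine(a, b):
    """Cosine similarity, used here as an assumed cross-media similarity measure."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


encoder = TwoBranchEncoder(seed=0)
rng = random.Random(1)
query = [rng.uniform(-1, 1) for _ in range(D_IN)]      # e.g. an image query
candidate = [rng.uniform(-1, 1) for _ in range(D_IN)]  # e.g. a text candidate
score = cosine(encoder.encode("image", query), encoder.encode("text", candidate))
```

In this sketch every media type passes through the same common branch, so its output lives in one shared space, while the proprietary branch differs per media type. How the actual model combines the two branches and measures similarity is not stated in the abstract.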

Full Text