Abstract

With the emergence of massive multimedia data on the Internet, the demand for cross-media retrieval applications is growing. The capability to represent modal features is crucial for cross-media retrieval. This paper proposes an improved feature correlation model for the image and text modalities based on adversarial networks, called FCMAN, which uses adversarial training to establish feature correlations between the image and text modalities. First, different features of the image modality are fused to strengthen its feature representation. Second, building on feature correlation modeling with a single adversarial network, two additional adversarial networks are introduced to model the real labels and the predicted labels of the image and text modalities, respectively. Experimental results show that the proposed approach establishes feature correlations between the image and text modalities effectively.
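
The abstract describes the architecture only at a high level. The following is a minimal PyTorch sketch of the core idea: fused image features and text features are projected into a common space, where a modality discriminator is trained adversarially against the projectors. All module names, layer sizes, and the choice of two image feature sources are illustrative assumptions, not the authors' implementation; the paper's two additional label-level adversarial networks (over real and predicted labels) are noted only in a comment.

# Minimal sketch of the adversarial feature-correlation idea (assumptions, not the authors' code).
import torch
import torch.nn as nn

class ImageFeatureFusion(nn.Module):
    # Fuse two image feature vectors (e.g. from two different CNN backbones) into one.
    def __init__(self, dim_a, dim_b, dim_out):
        super().__init__()
        self.fc = nn.Linear(dim_a + dim_b, dim_out)

    def forward(self, feat_a, feat_b):
        return torch.relu(self.fc(torch.cat([feat_a, feat_b], dim=1)))

class Projector(nn.Module):
    # Map a modality-specific feature into the common embedding space.
    def __init__(self, dim_in, dim_common):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, dim_common), nn.ReLU(),
            nn.Linear(dim_common, dim_common))

    def forward(self, x):
        return self.net(x)

class ModalityDiscriminator(nn.Module):
    # Adversary that tries to tell image embeddings from text embeddings.
    def __init__(self, dim_common):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_common, dim_common // 2), nn.ReLU(),
            nn.Linear(dim_common // 2, 1))

    def forward(self, z):
        return self.net(z)  # logit: 1 = image, 0 = text

def adversarial_step(img_z, txt_z, disc):
    # One adversarial step: the discriminator learns to separate the modalities,
    # while the projectors learn to make them indistinguishable in the common space.
    # FCMAN additionally trains two such adversaries on real and predicted labels;
    # they follow the same pattern and are omitted here for brevity.
    bce = nn.BCEWithLogitsLoss()
    ones = torch.ones(img_z.size(0), 1)
    zeros = torch.zeros(txt_z.size(0), 1)
    # Discriminator loss: classify embeddings by their true modality
    # (detach so this loss does not update the projectors).
    d_loss = bce(disc(img_z.detach()), ones) + bce(disc(txt_z.detach()), zeros)
    # Projector loss: fool the discriminator by swapping the labels.
    g_loss = bce(disc(img_z), zeros) + bce(disc(txt_z), ones)
    return d_loss, g_loss

# Usage with random stand-in features (dimensions are arbitrary assumptions):
fusion = ImageFeatureFusion(512, 256, 512)
img_proj, txt_proj = Projector(512, 128), Projector(300, 128)
disc = ModalityDiscriminator(128)
img_z = img_proj(fusion(torch.randn(8, 512), torch.randn(8, 256)))
txt_z = txt_proj(torch.randn(8, 300))
d_loss, g_loss = adversarial_step(img_z, txt_z, disc)

In practice the two losses are optimized alternately with separate optimizers, alongside a supervised loss (e.g. on the label predictions) that keeps the common space semantically discriminative.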
