Abstract

Cross-media retrieval aims to search multimedia data across different modalities with content-based methods. However, most existing methods are designed for retrieval within a single modality, such as image retrieval or audio retrieval. Although a few works have addressed cross-media retrieval, their performance is not yet satisfactory, and the potential of cross-media retrieval for boosting retrieval performance remains largely unexplored. Hence, in this paper we propose a novel cross-media retrieval approach for general multimedia data such as images and audio. First, image and audio samples are mapped into an isomorphic feature subspace with a kernel-based method; second, multimedia semantics are learned from inherent feature correlations by local linear regression; in addition, a graph model is constructed to exploit external knowledge from relevance feedback. We then build a unified objective function that integrates the inherent and external learning results, and by solving it we obtain a multimodal semantic space in which cross-media retrieval between images and audio is enabled. Extensive experiments validate the proposed method with encouraging results.
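To make the overall idea concrete, the following is a minimal toy sketch (not the authors' actual algorithm) of one way the first and last steps could look: each modality's features are kernelized into an isomorphic subspace spanned by the training samples, regressed onto shared semantic targets, and retrieval is then performed by similarity in that shared space. All data, dimensions, and the choice of RBF kernel with least-squares regression are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: paired image/audio features from 3 semantic classes.
n, d_img, d_aud, k = 60, 20, 12, 3
labels = rng.integers(0, k, n)
X_img = rng.normal(size=(n, d_img)) + labels[:, None]  # class shifts the mean
X_aud = rng.normal(size=(n, d_aud)) + labels[:, None]

def rbf_kernel(A, B, gamma=0.05):
    # K[i, j] = exp(-gamma * ||A_i - B_j||^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Map each modality into an isomorphic (same-dimensional) kernel feature space
# spanned by the training samples, then regress onto one-hot semantic targets.
Y = np.eye(k)[labels]
K_img = rbf_kernel(X_img, X_img)
K_aud = rbf_kernel(X_aud, X_aud)
W_img = np.linalg.lstsq(K_img, Y, rcond=None)[0]
W_aud = np.linalg.lstsq(K_aud, Y, rcond=None)[0]

def to_semantic(X_query, X_train, W):
    # Project samples of one modality into the shared semantic space.
    return rbf_kernel(X_query, X_train) @ W

# Cross-media retrieval: query with an image, rank audio clips by cosine
# similarity in the shared semantic space.
q = to_semantic(X_img[:1], X_img, W_img)
gallery = to_semantic(X_aud, X_aud, W_aud)
sims = (gallery @ q.T).ravel() / (
    np.linalg.norm(gallery, axis=1) * np.linalg.norm(q) + 1e-12
)
ranked = np.argsort(-sims)  # audio indices, most similar first
```

In this sketch the top-ranked audio clips share the query image's semantic class; the paper's actual method additionally learns the semantic space from local linear regression and a relevance-feedback graph via a unified objective, which this toy example does not model.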
