Abstract
Cross-media retrieval has received a great deal of attention on account of significant breakthroughs in single-modality retrieval. However, mismatches (so-called heterogeneity) are inevitable due to the inconsistent low-level features and the well-known semantic gap between data from different modalities. To deal with the mismatch problem, canonical correlation analysis (CCA), a classic two-view approach that maps data from different modalities into a common latent space, has been used for cross-media retrieval tasks. To improve the performance of CCA, in this paper we adopt Deep Canonical Correlation Analysis (DCCA), a nonlinear extension of CCA, to explore cross-media relations. Like most deep learning methods, DCCA is prone to over-fitting. To overcome over-fitting, we employ the Levenberg-Marquardt (LM) algorithm as the training method in our deep model. Experimental results on our collected image-audio dataset and the publicly available Wikipedia Articles dataset are encouraging and show that our proposed model (LM-DCCA) is effective from multiple perspectives.
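For reference, a minimal sketch of the standard two-view CCA objective the abstract refers to; the notation (views $X$, $Y$ for the two modalities, projection vectors $\mathbf{w}_x$, $\mathbf{w}_y$, and covariance matrices $\Sigma_{xx}$, $\Sigma_{yy}$, $\Sigma_{xy}$) is ours and not taken from the paper. CCA seeks the pair of projections that maximizes the correlation between the projected views, which defines the common latent space:

\[
(\mathbf{w}_x^{*}, \mathbf{w}_y^{*})
= \arg\max_{\mathbf{w}_x,\, \mathbf{w}_y} \operatorname{corr}\!\left(\mathbf{w}_x^{\top} X,\; \mathbf{w}_y^{\top} Y\right)
= \arg\max_{\mathbf{w}_x,\, \mathbf{w}_y}
\frac{\mathbf{w}_x^{\top} \Sigma_{xy}\, \mathbf{w}_y}
{\sqrt{\mathbf{w}_x^{\top} \Sigma_{xx}\, \mathbf{w}_x}\;\sqrt{\mathbf{w}_y^{\top} \Sigma_{yy}\, \mathbf{w}_y}}
\]

DCCA replaces the linear projections $\mathbf{w}_x^{\top} X$ and $\mathbf{w}_y^{\top} Y$ with the outputs of two deep networks and maximizes the same correlation criterion on the learned representations.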