Abstract

This paper presents a novel method for multimedia document content analysis based on modeling multimodal data correlations. We hypothesize that exploiting correlations among different modalities from the same data source yields better multimedia content understanding than exploring a single modality alone. We decompose this task into two stages: multimedia data fusion and multimodal correlation propagation. In the first stage, we re-organize the training multimedia data into Modality semAntic Documents (MADs) after extracting quantized multimodal features, and then use multivariate Gaussian distributions to characterize the continuous quantities in latent topic modeling. Model parameters are asymmetrically learned to initialize multimodal correlations in the latent topic space. In the second stage, we construct a Multimodal Correlation Network (MCN) from the initialized multimodal correlations, and propose a new mechanism that propagates inter-modality correlations and intra-modality similarities in the MCN, exploiting cross-modal complementarity to facilitate multimedia content analysis. Experimental results on image-audio data retrieval over a 10-category dataset and content-oriented web page recommendation on the USTODAY dataset show the effectiveness of our method.
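To make the second stage concrete, the sketch below illustrates one plausible form of correlation propagation on a multimodal correlation network: inter-modality correlations are iteratively diffused through the intra-modality similarity graphs and blended with the topic-model initialization. The function name, the update rule, and the parameters (alpha, n_iter, tol) are illustrative assumptions, not the exact mechanism proposed in the paper.

```python
import numpy as np

def propagate_correlations(C0, S_img, S_aud, alpha=0.5, n_iter=20, tol=1e-6):
    """Hypothetical sketch of correlation propagation on an MCN.

    C0    : (n_img, n_aud) inter-modality correlations initialized by the
            latent topic model.
    S_img : (n_img, n_img) row-normalized intra-modality similarities (images).
    S_aud : (n_aud, n_aud) row-normalized intra-modality similarities (audio).
    alpha : trade-off between propagated and initial correlations.
    """
    C = C0.copy()
    for _ in range(n_iter):
        # Diffuse correlations through both intra-modality graphs,
        # then blend with the initial topic-model correlations.
        C_new = alpha * S_img @ C @ S_aud.T + (1.0 - alpha) * C0
        if np.linalg.norm(C_new - C) < tol:
            return C_new
        C = C_new
    return C
```

Under these assumptions, the refined correlation matrix returned by the sketch could then rank audio documents for a query image (or vice versa) by reading off its rows or columns.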
