Affective video content analysis based on multimodal data fusion in heterogeneous networks

Jie Guo, Bin Song, Peng Zhang, Mengdi Ma, Wenwen Luo, Junmei Lv

doi:10.1016/j.inffus.2019.02.007

Abstract

In heterogeneous networks, multiple modalities coexist. For example, video sources of a given length usually carry abundant time-varying audiovisual data. From the users’ perspective, different video segments trigger different emotions. To interact better with users in heterogeneous networks and improve their experience, affective video content analysis that predicts users’ emotions is essential. In the literature, users’ emotions can be evaluated by arousal and valence values and by degree of fear, which makes it possible to quantify how accurately the audience’s reaction to a video is predicted. In this paper, we propose a multimodal data fusion method that integrates visual and audio data to perform affective video content analysis. Specifically, to align the visual and audio data, temporal attention filters are proposed to obtain time-span features of entire video segments. Then, using a two-branch network structure, the matched visual and audio features are integrated in a common space. Finally, the fused audiovisual feature is used for regression and classification subtasks that measure users’ emotional responses. Simulation results show that the proposed method can accurately predict users’ subjective feelings towards video content, which provides a way to predict users’ preferences and recommend videos according to their needs.
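
The pipeline outlined in the abstract (temporal pooling per modality, two branches projecting into a common space, then regression and classification on the fused feature) can be illustrated with a minimal, untrained sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian shape and fixed, evenly spaced centers of the temporal filters, the linear projections with L2 normalization, fusion by concatenation, the sigmoid fear head, the random weights, and all function and parameter names (`temporal_attention_pool`, `project`, `make_params`, `predict`).

```python
import math
import random


def temporal_attention_pool(frames, num_filters=3, width=1.0):
    """Pool T per-frame feature vectors into one fixed-length vector.

    Assumption: each filter is a normalized Gaussian over the time axis with
    an evenly spaced center; a trained model would learn these parameters.
    """
    T, dim = len(frames), len(frames[0])
    sigma = width * T / num_filters
    pooled = []
    for n in range(num_filters):
        center = (n + 0.5) * T / num_filters
        weights = [math.exp(-((t + 0.5 - center) ** 2) / (2 * sigma ** 2))
                   for t in range(T)]
        total = sum(weights)
        for d in range(dim):
            pooled.append(sum(w * f[d] for w, f in zip(weights, frames)) / total)
    return pooled  # length is num_filters * dim, independent of T


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(vec, weight):
    """One branch: linear map into the common space, then L2 normalization."""
    out = [dot(row, vec) for row in weight]
    norm = math.sqrt(dot(out, out)) or 1.0
    return [o / norm for o in out]


def make_params(visual_dim, audio_dim, common_dim, seed=0):
    """Random, untrained weights (hypothetical stand-ins for learned ones)."""
    rng = random.Random(seed)

    def vector(n):
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]

    return {
        "W_visual": [vector(visual_dim) for _ in range(common_dim)],
        "W_audio": [vector(audio_dim) for _ in range(common_dim)],
        "w_arousal": vector(2 * common_dim),
        "w_valence": vector(2 * common_dim),
        "w_fear": vector(2 * common_dim),
    }


def predict(visual_frames, audio_frames, params):
    """Fuse both modalities and run the regression and classification heads."""
    v = project(temporal_attention_pool(visual_frames), params["W_visual"])
    a = project(temporal_attention_pool(audio_frames), params["W_audio"])
    fused = v + a  # list concatenation; the fusion operator is an assumption
    return {
        "arousal": dot(params["w_arousal"], fused),  # regression output
        "valence": dot(params["w_valence"], fused),  # regression output
        "fear": 1.0 / (1.0 + math.exp(-dot(params["w_fear"], fused))),  # probability
    }


if __name__ == "__main__":
    rng = random.Random(1)
    visual = [[rng.random() for _ in range(4)] for _ in range(10)]  # 10 frames
    audio = [[rng.random() for _ in range(3)] for _ in range(25)]   # 25 frames
    # Pooled sizes are 3 filters * 4 dims = 12 and 3 filters * 3 dims = 9.
    print(predict(visual, audio, make_params(12, 9, common_dim=5)))
```

The point of the sketch is the alignment step: the visual and audio streams have different lengths (10 and 25 frames in the demo), yet each is reduced to a fixed-size time-span feature before the two branches map them into a shared space of the same dimension.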

Journal: Information Fusion
Publication Date: Feb 20, 2019
Citations: 19

