Human-Object Contour for Action Recognition with Attentional Multi-modal Fusion Network

Miao Yu, Jie Li, Qingxiang Zeng, Weizhe Zhang, Chao Wang

doi:10.1109/icaiic.2019.8669069

Publication Date: Feb 1, 2019

Citations: 3

Affiliation: Southeast University

Abstract

Human action recognition has great research and application value in intelligent video surveillance, human-computer interaction, and other communication fields. To improve the accuracy of human action recognition for video understanding, we study the extraction of human motion features and attentional fusion methods. This paper makes two main contributions. First, based on the essence of optical flow validity, we present a novel dynamic feature representation called Human-Object Contour (HOC), which combines object understanding and contextual information. Second, drawing on the principle of stacking in ensemble learning, we propose the Attentional Multi-modal Fusion Network (AMFN). Depending on the characteristics of each video, attention is used to select among the different modalities rather than simply averaging them with fixed weights. Experiments show that HOC effectively complements the static appearance feature and that our fusion network improves action recognition accuracy. Our approach achieves state-of-the-art performance on the HMDB51 (72.2%) and UCF101 (96.0%) datasets.
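
The contrast between fixed-weight averaging and attention-based selection of modalities can be illustrated with a minimal sketch. This is not the AMFN architecture, which the abstract does not specify. The function names, the two example modalities, and the hand-set attention logits are all hypothetical. In a learned system the logits would presumably be predicted per video by a trained network.

```python
import math


def softmax(logits):
    """Turn raw attention logits into weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixed_average(scores_per_modality):
    """Baseline: every modality gets the same weight for every video."""
    n = len(scores_per_modality)
    num_classes = len(scores_per_modality[0])
    return [
        sum(scores[c] for scores in scores_per_modality) / n
        for c in range(num_classes)
    ]


def attentional_fusion(scores_per_modality, attention_logits):
    """Weight each modality's class scores by a per-video attention weight.

    attention_logits is a hypothetical input: one raw score per modality,
    assumed here to come from some learned, video-dependent model.
    """
    weights = softmax(attention_logits)
    num_classes = len(scores_per_modality[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, scores_per_modality))
        for c in range(num_classes)
    ]
    return fused, weights


# Hypothetical class scores for one video from two modalities.
appearance_scores = [0.6, 0.3, 0.1]  # static appearance stream (assumed)
motion_scores = [0.2, 0.7, 0.1]      # dynamic stream, e.g. HOC-like (assumed)

baseline = fixed_average([appearance_scores, motion_scores])
fused, weights = attentional_fusion(
    [appearance_scores, motion_scores],
    attention_logits=[0.0, 2.0],  # hand-set: favour the motion stream
)
```

With equal logits the attentional fusion reduces to the fixed average. Unequal logits shift the fused scores toward the modality judged more informative for that particular video.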
