Abstract

Traditional action recognition algorithms typically attend only to a video's RGB features or optical-flow features and make poor use of the audio information in the video. Building on RGB and optical-flow features, this paper introduces the processing of audio information and classifies videos using element-level, fine-grained multi-modal fusion. Experimental comparison shows that the proposed multi-modal fusion algorithm improves accuracy by 7.38% on the HMDB51 dataset and by 3.18% on the UCF101 dataset over simple modal splicing (concatenation of modality features). The results also demonstrate that introducing the audio modality effectively improves model performance.
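To illustrate the contrast the abstract draws, the following is a minimal sketch of simple modal splicing versus element-level fusion of three modality feature vectors. All names, dimensions, and the weighted-sum fusion rule are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical per-modality feature vectors, assumed already extracted
# and projected to a common dimensionality D (an assumption for this
# sketch, not taken from the paper).
D = 512
rgb_feat = np.random.randn(D)    # RGB stream feature
flow_feat = np.random.randn(D)   # optical-flow stream feature
audio_feat = np.random.randn(D)  # audio stream feature

# Baseline: simple modal splicing, i.e. concatenation into a 3*D vector;
# each output dimension carries information from only one modality.
spliced = np.concatenate([rgb_feat, flow_feat, audio_feat])

# Element-level fusion: combine the modalities element by element
# (here, a hypothetical weighted sum), so every dimension of the fused
# vector mixes information from all three streams.
w = np.array([0.4, 0.3, 0.3])  # hypothetical per-modality weights
fused = w[0] * rgb_feat + w[1] * flow_feat + w[2] * audio_feat

print(spliced.shape, fused.shape)  # (1536,) vs. (512,)
```

In a trained model the fusion weights would be learned rather than fixed; the point of the sketch is only that element-level fusion interacts the modalities per dimension, whereas splicing keeps them side by side.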
