Integrating Gaussian mixture model and dilated residual network for action recognition in videos

Ming Fang, Chih-Cheng Hung, Fengqin Yang, Xiaoying Bai, Shuhua Liu, Jianwei Zhao

doi:10.1007/s00530-020-00683-4

Abstract

Action recognition in video is an important application of computer vision. In recent years, the two-stream architecture has driven significant progress in action recognition, but it has not systematically explored spatio-temporal features. This paper therefore proposes GD-RN, an integrated approach that combines a Gaussian mixture model (GMM) with a dilated convolution residual network for action recognition. The method uses ResNet-101 as both the spatial and the temporal stream ConvNet. In the spatial stream, the action video is first passed through the GMM for background subtraction, and the resulting video, in which the action profile is marked, is then sent to ResNet-101 for identification and classification. Compared with the baseline, in which the ConvNet takes the original RGB image as input, this reduces both the complexity of the video background and the amount of computation needed to learn spatial information. In the temporal stream, stacked optical flow images are fed to a ResNet-101 augmented with dilated convolution, which expands the convolutional receptive field without lowering the resolution of the optical flow image and thereby improves classification accuracy. The two ConvNets of the GD-RN network learn spatial and temporal features independently; the network is then fine-tuned and the spatio-temporal features are fused to produce the final action recognition result. The proposed method is tested on the challenging UCF101 and HMDB51 datasets, where it reaches accuracy rates of 91.3% and 62.4%, respectively, showing that it achieves competitive results.
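The claim that dilated convolution expands the receptive field without lowering resolution can be illustrated with a minimal sketch. This is not the authors' implementation: it is a 1-D, pure-Python toy, and the names `effective_kernel_size` and `dilated_conv1d_same` are hypothetical helpers introduced only for illustration. It assumes stride 1, an odd kernel size, and zero padding.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span of input positions covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d_same(signal, kernel, dilation=1):
    """1-D dilated convolution (cross-correlation form), stride 1, zero padding.

    The padding is chosen so the output has the same length as the input,
    i.e. the resolution is preserved regardless of the dilation rate.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for i in range(len(signal)):
        acc = 0.0
        # Kernel taps are spaced `dilation` samples apart.
        for j, w in enumerate(kernel):
            acc += w * padded[i + j * dilation]
        out.append(acc)
    return out


# Illustrative input: a ramp of 10 samples and a 3-tap summing kernel.
signal = [float(v) for v in range(10)]
kernel = [1.0, 1.0, 1.0]

plain = dilated_conv1d_same(signal, kernel, dilation=1)    # each output sees 3 samples
dilated = dilated_conv1d_same(signal, kernel, dilation=2)  # each output spans 5 samples
```

Both outputs have the same length as the input, while the span seen by each output position grows from 3 to 5 samples with the same number of weights. In a deep learning framework the same effect is normally obtained through the dilation argument of a 2-D convolution layer rather than hand-written loops.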

Journal: Multimedia Systems
Publication Date: Aug 20, 2020
Citations: 5
