Multimodal activity recognition with local block CNN and attention-based spatial weighted CNN

Suguo Zhu, Zhenying Fang, Yi Wang, Jun Yu, Junping Du

doi:10.1016/j.jvcir.2018.12.026

Abstract

Deep-learning-based human activity recognition approaches combine spatial and temporal information to complete the recognition task. The temporal information is extracted from optical flow, which is typically compensated by a warping method to achieve better performance. However, these methods usually start from global features: they consider only the global information of video frames and ignore the local information that reflects changes in human behavior, which makes the algorithm sensitive to external conditions such as occlusion and illumination change. To address these problems, this paper fuses the local spatial features of video frames, global spatial features, and temporal features to recognize different actions, and further extracts visual attention weights to constrain the global spatial features. Experiments show that the proposed algorithm achieves better accuracy than existing methods.
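The abstract names two ideas that a small example can make concrete: weighting global spatial features by visual attention, and fusing local spatial, global spatial, and temporal streams. The following is only an illustrative sketch, not the authors' implementation. It assumes a softmax over per-location attention scores and a weighted average of per-class scores (late fusion), neither of which is specified in the abstract. All function names, feature values, and stream weights are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighted_pool(features, attention_logits):
    """Pool per-location feature vectors using attention weights.

    features: list of feature vectors, one per spatial location.
    attention_logits: one raw attention score per location (assumed input).
    Returns a single feature vector: the attention-weighted sum.
    """
    weights = softmax(attention_logits)
    channels = len(features[0])
    return [
        sum(w * f[c] for w, f in zip(weights, features))
        for c in range(channels)
    ]


def fuse_scores(stream_scores, stream_weights):
    """Late fusion (an assumption): weighted average of per-class scores."""
    total = sum(stream_weights)
    n_classes = len(stream_scores[0])
    return [
        sum(w * s[k] for w, s in zip(stream_weights, stream_scores)) / total
        for k in range(n_classes)
    ]


# Hypothetical toy inputs: two spatial locations, two channels.
pooled = attention_weighted_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# Hypothetical per-class scores from the three streams named in the abstract.
local_scores = [0.2, 0.8]
global_scores = [0.6, 0.4]
temporal_scores = [0.1, 0.9]
fused = fuse_scores(
    [local_scores, global_scores, temporal_scores],
    [1.0, 1.0, 2.0],  # illustrative stream weights, not from the paper
)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

With equal attention logits the pooling reduces to a plain average over locations; unequal logits would shift the pooled feature toward the locations with higher scores, which is the sense in which attention "constrains" the global features in this sketch.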


Journal: Journal of Visual Communication and Image Representation
Publication Date: Dec 28, 2018
Citations: 5
