Club Ideas and Exertions: Aggregating Local Predictions for Action Recognition

Congqi Cao, Yanning Zhang, Runping Xi, Jiakang Li

doi:10.1109/tcsvt.2020.3017203

Abstract

Recognizing the actions performed in a video is challenging for an intelligent system because videos exhibit wide variations and carry an enormous amount of information. An attention mechanism focuses on key target areas, ignores irrelevant information, and extracts more discriminative features. In recent years, attention mechanisms have been introduced into video recognition. Although this has spawned a rich literature, most research on attention aims to aggregate local features. Instead of aggregating features, we propose to aggregate decisions made on local spatio-temporal attention regions for action recognition, an approach inspired by ensemble learning. The proposed decision fusion module is easy to interpret and architecture-independent. In this article, the regions around the body joints are regarded as the key regions. We use the regions of the 3-D feature maps that correspond to the body joints as the basic local features for local classification. Finally, all the local classification results are combined to make a global decision. Furthermore, when training the network, we can selectively add supervision to the local and global decisions. We experimentally show that the proposed mechanism improves recognition performance on multiple datasets, which demonstrates its effectiveness.
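To make the idea of decision-level aggregation concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a feature map of shape (C, T, H, W), joint positions already mapped to integer feature-map coordinates, a single linear classifier shared by all joints, and plain averaging of the local class probabilities as the fusion rule. The abstract specifies none of these details, so the function names (`pool_joint_regions`, `aggregate_decisions`, `combined_loss`), the `radius` parameter, and the loss weights `w_local` and `w_global` are all hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pool_joint_regions(feat, joints, radius=1):
    """Average-pool a small window around each joint in every frame.

    feat:   (C, T, H, W) 3-D feature map (assumed layout).
    joints: (T, J, 2) integer (row, col) positions in feature-map coordinates.
    Returns (J, C): one local feature per joint, averaged over space and time.
    """
    C, T, H, W = feat.shape
    J = joints.shape[1]
    local = np.zeros((J, C))
    for j in range(J):
        per_frame = []
        for t in range(T):
            # Clip to the map so the window is never empty.
            r = int(np.clip(joints[t, j, 0], 0, H - 1))
            c = int(np.clip(joints[t, j, 1], 0, W - 1))
            r0, r1 = max(r - radius, 0), min(r + radius + 1, H)
            c0, c1 = max(c - radius, 0), min(c + radius + 1, W)
            per_frame.append(feat[:, t, r0:r1, c0:c1].mean(axis=(1, 2)))
        local[j] = np.mean(per_frame, axis=0)
    return local

def aggregate_decisions(local_feats, W_cls, b_cls):
    """Classify each joint region, then fuse the decisions by averaging.

    Averaging is an assumed fusion rule chosen for illustration.
    """
    logits = local_feats @ W_cls + b_cls      # (J, K) local logits
    local_probs = softmax(logits, axis=-1)    # (J, K) local decisions
    global_probs = local_probs.mean(axis=0)   # (K,)  global decision
    return local_probs, global_probs

def combined_loss(local_probs, global_probs, label, w_local=1.0, w_global=1.0):
    """Cross-entropy on local and global decisions with selectable weights.

    Setting w_local or w_global to 0 switches that supervision off.
    """
    eps = 1e-12
    local_ce = -np.log(local_probs[:, label] + eps).mean()
    global_ce = -np.log(global_probs[label] + eps)
    return w_local * local_ce + w_global * global_ce

# Toy usage with random data: 8 channels, 4 frames, 7x7 map, 5 joints, 3 classes.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 7, 7))
joints = rng.integers(0, 7, size=(4, 5, 2))
W_cls = rng.normal(size=(8, 3))
b_cls = np.zeros(3)

local_feats = pool_joint_regions(feat, joints)
local_probs, global_probs = aggregate_decisions(local_feats, W_cls, b_cls)
loss = combined_loss(local_probs, global_probs, label=1)
```

Because each joint produces its own class distribution before fusion, the per-joint rows of `local_probs` can be inspected directly, which is one way to read the abstract's claim that the decision fusion module is easy to interpret.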

Journal: IEEE Transactions on Circuits and Systems for Video Technology: a publication of the Circuits and Systems Society
Publication Date: Aug 17, 2020
Citations: 4

Similar Papers

Symmetrical irregular local features for fine-grained visual classification
Ming Yang ... Zhihui Wei
Neurocomputing | VOL. 505
18 Jul 2022

Local Heterogeneous Features for Person Re-Identification in Harsh Environments
Haijia Zhang ... Zhong Zhang
IEEE access : practical innovations, open solutions | VOL. 8
01 Jan 2020

Human action recognition in surveillance video of a computer laboratory
Abdul-Lateef Yussiff ... Baharum B Baharudin
01 Aug 2016

Semantic-Augmented Local Decision Aggregation Network for Action Recognition
Congqi Cao ... Yanning Zhang
01 Jan 2021
