From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation

Xin Liu, Zitong Yu, Jingyu Yang, Huanjing Yue, Chao Hao

doi:10.1145/3687474

Abstract

Action anticipation is the task of predicting which action will happen next from an observed video, which requires a model to summarize the present and then reason about the future. Experience and common sense suggest that different actions are significantly correlated, and this correlation provides valuable prior knowledge for action anticipation. However, previous methods have not effectively modeled this underlying statistical relationship. To address this issue, we propose a novel end-to-end, attention-based video modeling architecture named Anticipation via Recognition and Reasoning (ARR). ARR decomposes action anticipation into two tasks, action recognition and sequence reasoning, and learns the statistical relationship between actions through next action prediction (NAP). Compared with existing temporal aggregation strategies, ARR extracts more effective features from the observable video and makes more reasonable predictions. In addition, because relationship modeling requires extensive training data, we propose an approach for unsupervised pre-training of the decoder that leverages the inherent temporal dynamics of video to enhance the reasoning capability of the network. Extensive experiments on the EPIC-Kitchens-100, EGTEA Gaze+, and 50 Salads datasets demonstrate the efficacy of the proposed methods. The code is available at https://github.com/linuxsino/ARR.
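As a rough illustration of what "a statistical relationship between actions" can mean, the sketch below counts which action tends to follow which in a few toy sequences and uses those counts to guess the next action after a recognized one. This is not the ARR architecture, which the abstract describes as attention-based and end-to-end; it is a simple bigram count model, and the function names and the toy sequences are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count how often each action is directly followed by each other action."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, recognized_action):
    """Return the most frequent successor of the recognized action, or None."""
    followers = counts.get(recognized_action)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy action sequences (assumption: not taken from any dataset).
toy_sequences = [
    ["take knife", "cut tomato", "put tomato in bowl"],
    ["take knife", "cut tomato", "wash knife"],
    ["take knife", "cut tomato", "put tomato in bowl"],
    ["open fridge", "take tomato", "wash tomato"],
]

transitions = fit_transitions(toy_sequences)
# Step 1 (recognition) is assumed to have produced the label "cut tomato";
# step 2 (reasoning) looks up its most frequent successor.
prediction = predict_next(transitions, "cut tomato")
print(prediction)
```

The two-step structure mirrors the decomposition in the abstract only loosely: a recognized label stands in for the recognition stage, and a lookup over counted transitions stands in for sequence reasoning.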


Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Publication Date: Aug 28, 2024
License type: other-oa

Similar Papers

Action anticipation for collaborative environments: The impact of contextual information and uncertainty-based prediction
Clebeson Canuto ... José Santos-Victor
Neurocomputing | VOL. 444
24 Nov 2020

Future Transformer for Long-term Action Anticipation
Dayoung Gong ... Joonseok Lee
01 Jun 2022

METIER
Ling Chen ... Yi Zhang
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies | VOL. 4
18 Mar 2020

Supervised Spatio-Temporal Neighborhood Topology Learning for Action Recognition
Andy J Ma ... Wilman W W Zou
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 23
01 Aug 2013
