Learning Disentangled Classification and Localization Representations for Temporal Action Localization

Zixin Zhu, Nanning Zheng, Ziyi Liu, Le Wang, Wei Tang, Gang Hua

doi:10.1609/aaai.v36i3.20277

Abstract

A common approach to Temporal Action Localization (TAL) is to generate action proposals and then perform action classification and localization on them. For each proposal, existing methods universally use a shared proposal-level representation for both tasks. However, our analysis indicates that this shared representation focuses on the most discriminative frames for classification, e.g., "take-offs" rather than "run-ups" when distinguishing "high jump" from "long jump", while frames most relevant to localization, such as the start and end frames of an action, are largely ignored. In other words, such a shared representation cannot handle classification and localization well at the same time, which makes precise TAL difficult. To address this challenge, this paper disentangles the shared representation into classification and localization representations. The disentangled classification representation focuses on the most discriminative frames, and the disentangled localization representation focuses on the action phase as well as the action start and end. Our model can be divided into two sub-networks: the disentanglement network and the context-based aggregation network. The disentanglement network is an autoencoder that learns orthogonal hidden variables for classification and localization. The context-based aggregation network aggregates the classification and localization representations by modeling local and global contexts. We evaluate the proposed method on two popular TAL benchmarks, where it outperforms all state-of-the-art methods.
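
To make the disentanglement idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes linear encoders and a linear decoder, a mean squared reconstruction error, and a squared dot-product penalty as one possible way to encourage orthogonal codes; the abstract does not specify any of these details. All names (`disentangle`, `W_cls`, `W_loc`, `W_dec`) and the dimensions are hypothetical.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def disentangle(x, W_cls, W_loc, W_dec):
    """Split a shared proposal feature x into two codes and score them.

    Assumption: both encoders and the decoder are single linear maps.
    """
    z_cls = matvec(W_cls, x)              # classification code
    z_loc = matvec(W_loc, x)              # localization code
    x_hat = matvec(W_dec, z_cls + z_loc)  # decode the concatenated codes
    # Reconstruction term: the two codes together should preserve x.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # Orthogonality term (assumed form): penalize a non-zero dot product.
    dot = sum(a * b for a, b in zip(z_cls, z_loc))
    ortho = dot ** 2
    return z_cls, z_loc, recon, ortho


rng = random.Random(0)
feat_dim, code_dim = 8, 4  # hypothetical sizes
x = [rng.gauss(0, 1) for _ in range(feat_dim)]
W_cls = random_matrix(code_dim, feat_dim, rng)
W_loc = random_matrix(code_dim, feat_dim, rng)
W_dec = random_matrix(feat_dim, 2 * code_dim, rng)

z_cls, z_loc, recon, ortho = disentangle(x, W_cls, W_loc, W_dec)
loss = recon + 1.0 * ortho  # the weight 1.0 is an arbitrary choice
```

In a trained model the weights would be optimized to reduce `loss`, and the two codes would then feed separate classification and localization heads; the context-based aggregation step is not covered by this sketch.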


Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 5

Similar Papers

Enriching Local and Global Contexts for Temporal Action Localization
Zixin Zhu ... Nanning Zheng
01 Oct 2021

ContextLoc++: A Unified Context Model for Temporal Action Localization
Zixin Zhu ... Gang Hua
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 45
01 Aug 2023

A Novel Action Saliency and Context-Aware Network for Weakly-Supervised Temporal Action Localization
Yibo Zhao ... Meng Wang
IEEE Transactions on Multimedia | VOL. 25
01 Jan 2023

Learning Task-Specific and Shared Representations in Medical Imaging
Felix J S Bragman ... Ryutaro Tanno
01 Jan 2019
