Temporal Action Detection based on Temporal Deformable Proposal Generation

Liting Yan, Huifen Xia, Yongzhao Zhan

doi:10.1109/itia50152.2020.9312328

Abstract

Temporal action detection is a challenging task in video understanding. A central difficulty is localizing activities accurately when temporal action boundaries are fuzzy, so it is crucial to generate high-quality temporal proposals with precise boundaries. To address this issue, we propose a novel temporal action detection method based on Temporal Deformable Proposal Generation (TDPG). In TDPG, a modified C3D network extracts robust features. A Deformable Proposal Generation module adaptively expands the receptive field through temporal deformable convolution to generate deformable proposals with more precise temporal boundaries. Selected deformable proposal segments are then used to predict action classification scores and refine the temporal boundaries. Our method is trained end to end with jointly optimized classification and regression losses. Experimental results show that our method outperforms state-of-the-art methods on two public datasets, THUMOS’14 and Charades.
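
To illustrate the idea of temporal deformable convolution mentioned in the abstract, the following is a minimal single-channel sketch in plain Python. It is not the TDPG implementation: the function names `linear_sample` and `temporal_deformable_conv` are hypothetical, the per-position offsets are passed in directly rather than predicted by a learned layer, and zero padding is assumed outside the sequence. Each kernel tap samples the input at its regular position plus a fractional offset, using linear interpolation, which is how the receptive field can stretch or shrink along the time axis.

```python
import math


def linear_sample(x, pos):
    """Linearly interpolate the sequence x at fractional index pos.

    Positions outside the sequence contribute zero (assumed zero padding).
    """
    lo = math.floor(pos)
    frac = pos - lo

    def at(i):
        return x[i] if 0 <= i < len(x) else 0.0

    return (1.0 - frac) * at(lo) + frac * at(lo + 1)


def temporal_deformable_conv(x, weights, offsets, dilation=1):
    """Single-channel 1D deformable convolution over the time axis.

    x:       list of T feature values (one per time step)
    weights: list of K kernel weights
    offsets: T x K nested list of fractional offsets; in a real model
             these would be predicted from the features (assumption:
             here they are simply supplied by the caller)
    """
    k = len(weights)
    half = k // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j in range(k):
            # Regular sampling position shifted by the offset for this tap.
            pos = t + (j - half) * dilation + offsets[t][j]
            acc += weights[j] * linear_sample(x, pos)
        out.append(acc)
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    zero = [[0.0, 0.0, 0.0] for _ in x]
    # With all offsets at zero this reduces to an ordinary convolution.
    print(temporal_deformable_conv(x, [1.0, 1.0, 1.0], zero))
```

With all offsets set to zero the function behaves like a standard 1D convolution. Non-zero offsets move each sampling point along the time axis, which is the property the abstract relies on for adapting proposals to fuzzy action boundaries.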

Full Text