A Temporal Action Detection Model With Feature Pyramid Network

Qirong Lai, Chunyan Yu, Xiu Wang

doi:10.1145/3443467.3443808

https://doi.org/10.1145/3443467.3443808

Publication Date: Nov 6, 2020

Affiliation: Fuzhou University

Abstract

To find all actions in an untrimmed video, temporal action detection localizes the start and end of each action and identifies its category simultaneously. Unlike a trimmed video, which always contains a single action instance, an untrimmed video is much more complicated: it contains not only multiple action instances but also multiple background clips between them. This complexity poses a great challenge to temporal action detection. Structured Segment Networks (SSN), a recently proposed temporal action detection method, constructs a two-stage pyramid structure to obtain the temporal features of an action instance and uses them for classification and localization. SSN works well except when a video contains multiple action instances that vary greatly in amplitude and duration. This paper introduces a feature pyramid network into the feature extraction phase of SSN to expand the receptive field of the network and obtain features at different scales, which are used to predict action completeness, category, and boundary, respectively. Compared with the original SSN and other existing models, experimental results on the THUMOS14 dataset show the effectiveness of our method.
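As a rough illustration of the multi-scale idea mentioned in the abstract, the sketch below builds a small temporal feature pyramid over a sequence of per-step feature vectors. It is a generic feature-pyramid-style construction and not the architecture from the paper: the function names (`downsample`, `upsample`, `temporal_pyramid`), the use of average pooling with stride 2, and the merge by element-wise addition are all assumptions chosen to keep the example short.

```python
from typing import List

# Hypothetical representation: T time steps, each a C-dimensional feature vector.
FeatureSeq = List[List[float]]


def downsample(seq: FeatureSeq) -> FeatureSeq:
    """Halve the temporal length by averaging adjacent pairs of steps."""
    out = []
    for t in range(0, len(seq) - 1, 2):
        out.append([(a + b) / 2.0 for a, b in zip(seq[t], seq[t + 1])])
    return out


def upsample(seq: FeatureSeq, length: int) -> FeatureSeq:
    """Nearest-neighbour upsampling to the requested temporal length."""
    return [seq[min(t * len(seq) // length, len(seq) - 1)] for t in range(length)]


def temporal_pyramid(seq: FeatureSeq, levels: int) -> List[FeatureSeq]:
    """Return `levels` feature sequences, finest first.

    Assumes len(seq) >= 2 ** (levels - 1) so that no level is empty.
    """
    # Bottom-up pathway: progressively coarser temporal resolution.
    bottom_up = [seq]
    for _ in range(levels - 1):
        bottom_up.append(downsample(bottom_up[-1]))

    # Top-down pathway: add upsampled coarse features to each finer level.
    merged = [bottom_up[-1]]
    for finer in reversed(bottom_up[:-1]):
        coarse = upsample(merged[0], len(finer))
        merged.insert(
            0,
            [[f + c for f, c in zip(fv, cv)] for fv, cv in zip(finer, coarse)],
        )
    return merged


if __name__ == "__main__":
    features = [[float(t)] for t in range(8)]  # toy 1-dimensional features
    for level, level_seq in enumerate(temporal_pyramid(features, 3)):
        print(level, len(level_seq))
```

Each coarser level summarizes a longer temporal span per step, so a predictor reading from several levels sees both short and long actions at a comparable relative scale. That is the intuition behind using features at different scales; the actual layers and prediction heads in the paper may differ.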
