Abstract

Temporal action detection in untrimmed videos, which requires classifying actions and localizing them in time simultaneously, is highly challenging. Capturing the relations among action proposals (i.e., candidate video segments) is of vital importance to this task. Although several methods attempt to encode such relations, they neglect the adverse effects of irrelevant or negative relations between proposals. Moreover, action durations vary widely across videos, a fact that has not been well explored. To address the former, we develop a truncated attention mechanism that learns positive proposal relations by dynamically adjusting the edge weights between proposal nodes in a graph, and build the proposal network with graph convolutional networks that suppress harmful proposal-pair relations by truncating negative attention scores. To address the latter, we devise a lightweight multi-scale dilation module, shared by all proposals, that handles varying action durations by enlarging the temporal receptive field, thereby capturing temporal context and increasing the representational capacity of proposals. Unifying these components, we present the Multi-scale Dilation based Truncated Attention Proposal Network (MD-TAPN) for temporal action detection. Our model achieves state-of-the-art performance on two benchmark datasets, and in particular it outperforms the most competitive method by a significant margin of 3.6% mAP at tIoU 0.5 on THUMOS14.
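
To make the two mechanisms concrete, the sketch below illustrates, in PyTorch (the framework is an assumption; the abstract does not name one), how negative attention scores can be truncated before graph convolution over proposals, and how parallel dilated 1D convolutions enlarge the temporal receptive field. All class names, layer sizes, dilation rates, and the row-normalization and fusion choices are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the two mechanisms described in the abstract.
    # Layer sizes, normalization, and fusion choices are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TruncatedAttentionGCN(nn.Module):
        """Graph convolution over proposals whose edge weights are
        attention scores with negative entries truncated to zero."""

        def __init__(self, dim):
            super().__init__()
            self.query = nn.Linear(dim, dim)
            self.key = nn.Linear(dim, dim)
            self.value = nn.Linear(dim, dim)

        def forward(self, x):                       # x: (N, dim), N proposals
            q, k = self.query(x), self.key(x)
            scores = q @ k.t() / x.size(-1) ** 0.5  # pairwise relation scores
            adj = F.relu(scores)                    # truncate negative relations
            adj = adj / adj.sum(-1, keepdim=True).clamp(min=1e-6)  # row-normalize
            return F.relu(adj @ self.value(x))      # aggregate positive neighbors

    class MultiScaleDilation(nn.Module):
        """Parallel dilated 1D convolutions, shared by all proposals,
        that enlarge the temporal receptive field at several scales."""

        def __init__(self, channels, dilations=(1, 2, 4)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv1d(channels, channels, kernel_size=3,
                          padding=d, dilation=d)
                for d in dilations
            )

        def forward(self, x):                       # x: (N, channels, T)
            return F.relu(sum(b(x) for b in self.branches))

As a usage example under these assumptions, TruncatedAttentionGCN(256) maps a (32, 256) tensor of proposal features to relation-enhanced features of the same shape, while MultiScaleDilation(256) maps a (32, 256, T) proposal feature sequence to the same shape with a wider effective receptive field.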
