Abstract

Temporal action proposal generation, a component of temporal action localization, aims to identify temporal intervals in untrimmed videos that are likely to contain actions. Prevailing bottom-up proposal generation methods locate action boundaries (the start and the end) at positions with high classification probabilities. For many actions, however, the motion at the boundaries is not discriminative, so action segments and background segments are misclassified as boundaries, which yields proposals with low overlap. In this work, we propose a novel method that generates proposals by evaluating the continuity of video frames and locating the start and the end at points of low continuity. Our method consists of two modules: boundary discrimination and proposal evaluation. The boundary discrimination module trains a model to understand the relationship between two frames and uses the continuity of frames to generate proposals. The proposal evaluation module removes background proposals with a classification network and evaluates the integrity of the remaining proposals from probability features with an integrity network. Extensive experiments on two challenging datasets, THUMOS14 and ActivityNet 1.3, demonstrate that our method outperforms state-of-the-art proposal generation methods.
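
The boundary idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a continuity score for each pair of adjacent frames is already available (in the paper this comes from a trained model) and uses a fixed threshold, which is an assumption made only for illustration. The function names `low_continuity_boundaries` and `candidate_proposals` are hypothetical, and the proposal evaluation module is not shown.

```python
from itertools import combinations
from typing import List, Tuple


def low_continuity_boundaries(continuity: List[float], threshold: float) -> List[int]:
    """Return candidate boundary frame indices.

    continuity[k] is assumed to be the continuity score between frame k
    and frame k + 1. A score below the (assumed) threshold marks frame
    k + 1 as a candidate boundary.
    """
    return [k + 1 for k, score in enumerate(continuity) if score < threshold]


def candidate_proposals(boundaries: List[int], min_len: int = 1) -> List[Tuple[int, int]]:
    """Pair every earlier boundary with every later one as a (start, end) proposal."""
    ordered = sorted(boundaries)
    return [(s, e) for s, e in combinations(ordered, 2) if e - s >= min_len]


if __name__ == "__main__":
    # Hypothetical scores for a 7-frame clip (6 adjacent pairs).
    scores = [0.9, 0.2, 0.8, 0.9, 0.1, 0.9]
    bounds = low_continuity_boundaries(scores, threshold=0.5)
    print(bounds)
    print(candidate_proposals(bounds))
```

In this sketch every pair of candidate boundaries becomes a proposal. A later scoring stage, such as the classification and integrity networks described above, would be needed to rank or discard them.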
