Abstract

Online action detection aims to detect the current action in an untrimmed, streaming video, where only current and past frames are available. Recent methods for online action detection have focused on modeling discriminative representations from temporally partial information. However, they overlook the fact that the input video contains background as well as actions. To overcome this problem, in this paper we propose a novel approach, named Temporal Filtering Network, to distinguish between relevant and irrelevant information in a partially observed, untrimmed video. Specifically, we present a filtering module that learns relevance scores indicating how relevant each piece of information is to the current action. Our filtering module emphasizes information relevant to the current action while filtering out background and unrelated actions. We conduct extensive experiments on the THUMOS-14 and TVSeries datasets, on which the proposed method outperforms state-of-the-art methods by a large margin. We also show the effectiveness of the filtering module through comprehensive ablation studies.
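
For intuition only, the sketch below illustrates one way a relevance-score filtering module could be realized in PyTorch: per-frame features from the observed (current and past) frames are scored in [0, 1] and multiplicatively gated, so background and unrelated-action frames are attenuated before classification. The module name, the scorer architecture, and the gating scheme are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class TemporalFilteringModule(nn.Module):
    """Hypothetical sketch of a relevance-score filtering module.

    Predicts a relevance score per observed frame and uses it to suppress
    background / unrelated-action features before online action classification.
    """

    def __init__(self, feature_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Small per-frame scorer; the paper's actual architecture may differ.
        self.scorer = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),  # relevance score in [0, 1] for each frame
        )

    def forward(self, frame_features: torch.Tensor):
        # frame_features: (batch, time, feature_dim), current and past frames only
        scores = self.scorer(frame_features)   # (batch, time, 1)
        filtered = frame_features * scores     # emphasize relevant frames, attenuate the rest
        return filtered, scores.squeeze(-1)


# Usage: filter streaming features before feeding an online action classifier.
features = torch.randn(2, 64, 2048)            # e.g., 64 observed frames per clip
module = TemporalFilteringModule(feature_dim=2048)
filtered_features, relevance = module(features)
print(filtered_features.shape, relevance.shape)  # (2, 64, 2048) (2, 64)
```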
