Abstract

Spatial-temporal action detection in videos is a challenging problem that has attracted considerable attention in recent years. Most current approaches treat action detection as an object detection problem, applying successful object detection frameworks such as Faster R-CNN to detect actions in each frame independently and then generating action tubes by linking bounding boxes across the whole video in an offline fashion. However, unlike object detection in static images, temporal context information is vital for action detection in videos. We therefore propose an online action detection model that leverages the spatial-temporal context information present in videos to perform action inference and localization. More specifically, we model the spatial-temporal context pattern of actions with an encoder-decoder architecture built on a convolutional recurrent neural network. The model takes a video snippet as input and encodes its dynamic information during the forward pass; during the backward pass, the decoder combines the encoded information with the appearance or motion cue at each timestamp to perform action detection. In addition, we devise an incremental action-tube construction algorithm that enables our model to predict actions ahead of time and to perform action detection in an online fashion. To evaluate our method, we conduct experiments on three popular public datasets: UCF-101, UCF-Sports, and J-HMDB-21. The experimental results demonstrate that our method achieves competitive or superior performance compared to state-of-the-art methods. To encourage further research, we release our project at https://github.com/hjjpku/OATD.
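The following is a minimal PyTorch sketch of the encoder-decoder idea described above, not the authors' implementation: a convolutional recurrent cell encodes a snippet of per-frame features in the forward pass, and a second cell decodes the state together with each frame's appearance or motion cue in the backward (reverse-time) pass to produce per-frame context features for a detection head. The ConvGRU cell, layer widths, and tensor layout are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ConvGRUCell(nn.Module):
    """Convolutional GRU cell: gates are computed with 2-D convolutions,
    so the hidden state keeps its spatial layout."""

    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)  # reset + update gates
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)       # candidate state
        self.hid_ch = hid_ch

    def forward(self, x, h):
        if h is None:  # zero-initialize the hidden state with the input's spatial size
            h = x.new_zeros(x.size(0), self.hid_ch, x.size(2), x.size(3))
        r, z = torch.chunk(torch.sigmoid(self.gates(torch.cat([x, h], dim=1))), 2, dim=1)
        n = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * n + z * h


class SnippetEncoderDecoder(nn.Module):
    """Encode a video snippet's frame features, then decode a per-frame
    context feature by revisiting each frame's cue in reverse order."""

    def __init__(self, feat_ch=256, hid_ch=256):
        super().__init__()
        self.encoder = ConvGRUCell(feat_ch, hid_ch)
        self.decoder = ConvGRUCell(feat_ch, hid_ch)

    def forward(self, snippet):
        # snippet: (T, B, C, H, W) appearance or motion features, one map per frame
        h = None
        for t in range(snippet.size(0)):          # forward (encoding) pass over the snippet
            h = self.encoder(snippet[t], h)
        outputs = []
        for t in range(snippet.size(0) - 1, -1, -1):  # backward (decoding) pass
            h = self.decoder(snippet[t], h)       # fuse encoded context with the frame's cue
            outputs.append(h)                     # would feed a detection head (not shown)
        outputs.reverse()
        return torch.stack(outputs)               # (T, B, hid_ch, H, W) context features
```

The per-frame outputs would then be scored and linked across snippets by the incremental action-tube construction step, which extends tubes as new detections arrive rather than waiting for the whole video.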
