A Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection

Bharat Singh, Michael Jones, Ming Shao, Oncel Tuzel, Tim K Marks

doi:10.1109/cvpr.2016.216

Abstract

We present a multi-stream bi-directional recurrent neural network for fine-grained action detection. Recently, two-stream convolutional neural networks (CNNs) trained on stacked optical flow and image frames have been successful for action recognition in videos. Our system uses a tracking algorithm to locate a bounding box around the person, which provides a frame of reference for appearance and motion and also suppresses background noise that is not within the bounding box. We train two additional streams on motion and appearance cropped to the tracked bounding box, along with full-frame streams. Our motion streams use pixel trajectories of a frame as raw features, in which the displacement values corresponding to a moving scene point are at the same spatial position across several frames. To model long-term temporal dynamics within and between actions, the multi-stream CNN is followed by a bi-directional Long Short-Term Memory (LSTM) layer. We show that our bi-directional LSTM network utilizes about 8 seconds of the video sequence to predict an action label. We test on two action detection datasets: the MPII Cooking 2 Dataset, and a new MERL Shopping Dataset that we introduce and make available to the community with this paper. The results demonstrate that our method significantly outperforms state-of-the-art action detection methods on both datasets.
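The difference between stacked optical flow and the pixel trajectories mentioned in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes integer-valued per-frame flow fields and builds a trajectory by following a point from frame to frame, which is only one possible way to obtain the property the abstract describes. All function names (`stacked_flow`, `pixel_trajectory`, `make_flow`) are hypothetical.

```python
from typing import List, Tuple

# Assumption: flow[y][x] = (dx, dy) in whole pixels, from one frame to the next.
Flow = List[List[Tuple[int, int]]]


def clamp(v: int, lo: int, hi: int) -> int:
    """Keep a coordinate inside the image bounds."""
    return max(lo, min(hi, v))


def stacked_flow(flows: List[Flow], x: int, y: int) -> List[Tuple[int, int]]:
    """Plain flow stacking: always read the flow at the fixed pixel (x, y)."""
    return [f[y][x] for f in flows]


def pixel_trajectory(flows: List[Flow], x: int, y: int) -> List[Tuple[int, int]]:
    """Trajectory stacking: follow the scene point that starts at (x, y) and
    record its displacement at each step. Every value is stored under the
    starting pixel, so all entries describe the same scene point."""
    h, w = len(flows[0]), len(flows[0][0])
    cx, cy = x, y
    out = []
    for f in flows:
        dx, dy = f[cy][cx]
        out.append((dx, dy))
        # Move to where the point is in the next frame.
        cx = clamp(cx + dx, 0, w - 1)
        cy = clamp(cy + dy, 0, h - 1)
    return out


def make_flow(w: int, h: int, src: Tuple[int, int], d: Tuple[int, int]) -> Flow:
    """Hypothetical helper: a flow field that is zero except at pixel src."""
    f = [[(0, 0) for _ in range(w)] for _ in range(h)]
    sx, sy = src
    f[sy][sx] = d
    return f


# A single point moves right by one pixel per frame along a 4x1 image.
flows = [make_flow(4, 1, (k, 0), (1, 0)) for k in range(3)]
fixed = stacked_flow(flows, 0, 0)      # motion is seen only in the first step
traj = pixel_trajectory(flows, 0, 0)   # motion is seen in every step
```

In this toy case the fixed-position stack loses the point after the first frame, while the trajectory keeps all of its displacements at one spatial position, which is the alignment the abstract attributes to pixel trajectories.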

Similar Papers

Action Recognition in Videos Using Pre-Trained 2D Convolutional Neural Networks
Jun-Hwa Kim ... Chee Sun Won
IEEE Access | VOL. 8
01 Jan 2020

Action detection based on tracklets with the two-stream CNN
Minwen Zhang ... Chenqiang Gao
Multimedia Tools and Applications | VOL. 77
24 Aug 2017

Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling
Ashok Sarabu ... Ajit Kumar Santra
Data | VOL. 5
11 Nov 2020

Temporal grafter network: Rethinking LSTM for effective video recognition
Bingbing Zhang ... Peihua Li
Neurocomputing | VOL. 505
16 Jul 2022
