Abstract

Two-stream convolutional networks have shown strong performance on video action recognition tasks due to their ability to capture spatial and temporal features simultaneously. However, computing optical flow is time-consuming, which prevents real-time video processing. To address this problem, this paper proposes a new end-to-end architecture called SpatioTemporal Relation Networks (STRN), which extracts spatial and temporal information simultaneously from video using RGB input alone. STRN consists of two branches, called the appearance stream and the motion stream. The appearance stream retains the structure of the original spatial stream in the two-stream architecture, but takes consecutive frames instead of a single frame as input. The motion stream, which takes as input the relation information between adjacent features in the appearance stream, effectively complements the appearance stream. A relation block serves as the extractor of this relation information from the appearance stream. Because STRN learns spatiotemporal information from RGB input alone, it avoids the computation of optical flow. We validate STRN on UCF-101 and HMDB-51 and achieve better performance.
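The abstract describes the relation block only as an extractor of relation information between adjacent features in the appearance stream. As a minimal sketch of that idea, one simple hypothesis is a temporal difference of consecutive feature maps; the function name, shapes, and the difference operation below are assumptions for illustration, not the paper's actual operation.

```python
import numpy as np

def relation_block(features: np.ndarray) -> np.ndarray:
    """Hypothetical relation extractor over appearance-stream features.

    features: array of shape (T, C, H, W), one feature map per frame.
    Returns (T-1, C, H, W): a relation map between each pair of
    adjacent time steps, here taken as a simple temporal difference.
    """
    return features[1:] - features[:-1]

# Toy appearance-stream features: 8 frames, 16 channels, 7x7 spatial grid.
feats = np.random.randn(8, 16, 7, 7).astype(np.float32)
relations = relation_block(feats)
print(relations.shape)  # (7, 16, 7, 7)
```

In this reading, the motion stream would consume these relation maps in place of optical flow, so the whole network runs on RGB frames only.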
