Abstract

Two-stream Convolutional Neural Networks have shown excellent performance in video action recognition. Most existing works train each sampling group independently or fuse them only at the last level, which ignores the temporal continuity of actions and the complementary information between action fragments. In this paper, a temporal segment connection network is proposed to overcome these limitations. On the one hand, the forget gate module of the long short-term memory (LSTM) network is used to establish feature-level connections between the sampling groups. This not only strengthens information transmission between the sampling groups to enhance temporal connectivity, but also extracts the complementary information between them to enhance the overall representation of the action. On the other hand, a bi-directional long short-term memory (Bi-LSTM) network is used to automatically evaluate the importance weight of each sampling group based on the deep feature sequence. Experimental results on the UCF101 and HMDB51 datasets show that the proposed model can effectively improve the utilization of temporal information and the overall representation of actions, and thus significantly improves the accuracy of human action recognition.
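For concreteness, here is a minimal PyTorch sketch of the two modules described above, assuming each sampling group has already been encoded into a fixed-size feature vector. All module and parameter names (ForgetGateConnection, AdaptiveWeighting, feat_dim, hidden) are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ForgetGateConnection(nn.Module):
    """Sketch of a feature-level connection between sampling groups:
    an LSTM-style forget gate controls how much of the previous
    group's features flows into the current group."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_segments, feat_dim), one row per sampling group
        fused = [feats[:, 0]]
        for t in range(1, feats.size(1)):
            f = torch.sigmoid(self.gate(torch.cat([fused[-1], feats[:, t]], dim=-1)))
            fused.append(f * fused[-1] + feats[:, t])  # retained history + current group
        return torch.stack(fused, dim=1)

class AdaptiveWeighting(nn.Module):
    """Sketch of Bi-LSTM-based weighting: score each group's feature
    in temporal context, then softmax the scores into weights."""
    def __init__(self, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        ctx, _ = self.bilstm(feats)                # (batch, num_segments, 2*hidden)
        w = torch.softmax(self.score(ctx), dim=1)  # per-group importance weights
        return (w * feats).sum(dim=1)              # weighted video-level feature
```

Chaining the two modules yields one weighted video-level feature from the per-group features, matching the pipeline the abstract describes.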

Highlights

  • Video-based action recognition attracts extensive attention due to its applications in many fields like security and behavior analysis

  • The original two-stream convolutional neural network can combine spatial and temporal information, but it focuses only on short-term motion changes and does not capture long-term information about the video (a late-fusion sketch follows this list)

  • To verify the effect of the temporal segment connection network (TSCN) on action recognition, the baseline Temporal Segment Network (TSN) is used for comparison on the HMDB51 and UCF101 datasets
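To make the two-stream idea in the highlights concrete, below is a hedged sketch of score-level (late) fusion over an RGB stream and a stacked-optical-flow stream. The backbone choice (resnet18), the flow_stack depth, and all names are assumptions for illustration only:

```python
import torch.nn as nn
from torchvision.models import resnet18

class TwoStreamNet(nn.Module):
    """Illustrative two-stream baseline: separate spatial (RGB) and
    temporal (stacked optical flow) CNNs, fused at the score level."""
    def __init__(self, num_classes: int, flow_stack: int = 10):
        super().__init__()
        self.spatial = resnet18(num_classes=num_classes)   # single RGB frame
        self.temporal = resnet18(num_classes=num_classes)  # stacked flow fields
        # Flow input has 2*flow_stack channels (x/y displacement per frame pair).
        self.temporal.conv1 = nn.Conv2d(2 * flow_stack, 64,
                                        kernel_size=7, stride=2, padding=3, bias=False)

    def forward(self, rgb, flow):
        # Late fusion: average the two streams' class scores.
        return 0.5 * (self.spatial(rgb) + self.temporal(flow))
```

Because each stream sees only one frame (or a short flow stack), this baseline captures short-term motion but not long-range temporal structure, which is the gap TSN and TSCN address.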


Summary

INTRODUCTION

Video-based action recognition attracts extensive attention due to its applications in many fields such as security and behavior analysis. Video action recognition involves both spatial appearance information and temporal motion information. The original two-stream convolutional neural network can combine spatial and temporal information, but it focuses only on short-term motion changes and does not capture long-term information about the video. To address this issue, Wang and Xiong [15] proposed the Temporal Segment Network (TSN), which extracts several sampling groups from a video to enhance the long-term modeling ability of the network. The complementary information between sampling groups depends on the heterogeneity between them: the more heterogeneous the sampling groups are, the more complementary information they contain.
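As a rough illustration of TSN-style sparse sampling (the function name and defaults are assumptions, not taken from the paper): the video is split into equal spans and one frame is drawn from each, so the sampling groups together cover the whole action.

```python
import numpy as np

def sample_segments(num_frames: int, num_segments: int = 3, rng=None):
    """Split a video into num_segments equal spans and draw one frame
    index from each span (assumes num_frames >= num_segments)."""
    rng = rng or np.random.default_rng()
    bounds = np.linspace(0, num_frames, num_segments + 1).astype(int)
    return [int(rng.integers(lo, hi)) for lo, hi in zip(bounds[:-1], bounds[1:])]

# e.g. sample_segments(90) might return [7, 41, 62]: one frame per third.
```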

RELATED WORKS
FORGET-GATE CONNECTION MODULE
ADAPTIVE WEIGHTING MODULE
EXPERIMENT AND ANALYSIS
Findings
CONCLUSION

