Multi-stream CNN: Learning representations based on human-related regions for action recognition

Zhigang Tu, Wei Xie, Qianqing Qin, Ronald Poppe, Remco C Veltkamp, Baoxin Li, Junsong Yuan

doi:10.1016/j.patcog.2018.01.020

Abstract

The most successful video-based human action recognition methods rely on feature representations extracted using Convolutional Neural Networks (CNNs). Inspired by the two-stream network (TS-Net), we propose a multi-stream CNN architecture to recognize human actions. In addition to the full-frame streams, we consider human-related regions that contain the most informative features. First, by improving foreground detection, the region of interest corresponding to the appearance and the motion of an actor can be detected robustly under realistic circumstances. Based on the entire detected human body, we construct one appearance stream and one motion stream. In addition, we select a secondary region that contains the major moving part of an actor based on motion saliency. By combining the traditional streams with the novel human-related streams, we introduce a human-related multi-stream CNN (HR-MSCNN) architecture that encodes appearance, motion, and the captured tubes of the human-related regions. Comparative evaluation on the JHMDB, HMDB51, UCF Sports and UCF101 datasets demonstrates that the streams contain features that complement each other. The proposed multi-stream architecture achieves state-of-the-art results on these four datasets.
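The idea of combining complementary streams can be illustrated with a minimal late-fusion sketch. It is not the authors' implementation: the abstract does not state how the streams are fused, so the weighted averaging of per-stream softmax scores below is an assumption, and the stream names, the functions `fuse_streams` and `predict`, and the example scores are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_scores, weights=None):
    """Weighted average of per-stream class probabilities.

    stream_scores: dict mapping a (hypothetical) stream name to a list of
    raw class scores, one score per action class.
    weights: optional dict of per-stream weights; defaults to equal weights.
    Assumption: late fusion by averaging; the source does not specify this.
    """
    names = list(stream_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    num_classes = len(stream_scores[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_scores[name])
        for c, p in enumerate(probs):
            fused[c] += weights[name] * p / total_weight
    return fused


def predict(stream_scores, weights=None):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_streams(stream_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for three action classes from four hypothetical streams.
example_scores = {
    "frame_appearance": [2.0, 1.0, 0.0],
    "frame_motion": [0.0, 2.0, 1.0],
    "body_appearance": [0.0, 3.0, 0.0],
    "body_motion": [0.0, 3.0, 0.0],
}
predicted_class = predict(example_scores)
```

In this toy input the full-frame appearance stream alone favours class 0, while the other three streams favour class 1, so the fused prediction follows the majority evidence. Streams for the secondary motion-salient region could be added as further dictionary entries under the same assumption.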

Journal: Pattern Recognition
Publication Date: Feb 10, 2018
Citations: 201
License type: publisher-specific-oa

Similar Papers

A resource conscious human action recognition framework using 26-layered deep convolutional neural network
Muhammad Attique Khan ... Yu-Dong Zhang
Multimedia Tools and Applications | VOL. 80
01 Aug 2020

Human Activity Recognition in a Realistic and Multiview Environment Based on Two-Dimensional Convolutional Neural Network
Ashish Khare ... Om Prakash
Journal of Artificial Intelligence and Technology
09 May 2023

MSR-CNN: Applying motion salient region based descriptors for action recognition
Zhigang Tu ... Baoxin Li
01 Dec 2016

Analysis of CNN Architectures for Human Action Recognition in Video
David Silva ... Fernando Gaxiola
Computación y Sistemas | VOL. 26
30 Jun 2022
