Rank pooling dynamic network: Learning end-to-end dynamic characteristic for action recognition

Zhigang Zhu, Wenbo Zhang, Hongbing Ji, Yiping Xu

doi:10.1016/j.neucom.2018.08.018

Abstract

In video recognition, rank-pooling operators are a class of models for sorting video sequences; they act on either the raw inputs or the intermediate feature maps of a convolutional neural network (CNN). However, such models are currently restricted to optimizing a linear ranking function with Rank-SVM or Rank-SVR. In this paper, we first propose a CNN architecture called the RGB Rank Pooling Dynamic Network (RGB-RPDN), which maps a video to multiple frame-level dynamic spaces of the same size as the input. Importantly, a classical classifier developed for 2D images (e.g., FC, CNN) can be placed after the generated representation for action classification, so the joint architecture can be trained end to end. Second, we analyze how flow-level evolution can be modeled by the hand-crafted rank-pooling machine, and extend the frame-level dynamic space to a flow-level one with the Flow Rank Pooling Dynamic Network (Flow-RPDN). Third, we formulate equivalence relations between hand-crafted rank pooling and RPDN, and qualitatively compare their computing costs. Finally, the frame-level and flow-level pipelines are combined by late fusion to produce the final prediction. Specifically, we train the two-stream RPDN on UCF101 and HMDB51, with parameters initialized from models pre-trained on the large-scale Kinetics dataset. Experimental results demonstrate that RPDN improves on hand-crafted rank-pooling machines by a large margin and achieves higher classification accuracy in action recognition.
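For readers unfamiliar with the hand-crafted baseline that the abstract refers to, the following is a minimal sketch of rank pooling and of two-stream late fusion. It is not the authors' RPDN. It assumes a ridge-regression stand-in for Rank-SVR (regressing time-averaged frame features onto their frame index), and the function names `rank_pool` and `late_fusion`, the regularizer `reg`, and the fusion weight `flow_weight` are all hypothetical choices made for illustration.

```python
import numpy as np


def rank_pool(frames, reg=1e-3):
    """Summarize a (T, D) sequence of frame features as one vector w of shape (D,).

    Assumption: ridge regression onto the frame index is used here as a
    simple stand-in for Rank-SVR; the scores w . v_t should grow with t.
    """
    frames = np.asarray(frames, dtype=float)
    num_frames, dim = frames.shape
    t = np.arange(1, num_frames + 1, dtype=float)
    # Time-varying mean: v_t is the average of frames 1..t.
    smoothed = np.cumsum(frames, axis=0) / t[:, None]
    # Closed-form ridge solution: w = (V^T V + reg * I)^-1 V^T t.
    gram = smoothed.T @ smoothed + reg * np.eye(dim)
    return np.linalg.solve(gram, smoothed.T @ t)


def softmax(scores):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def late_fusion(rgb_scores, flow_scores, flow_weight=0.5):
    """Weighted average of per-stream class probabilities (weight is an assumption)."""
    rgb = softmax(np.asarray(rgb_scores, dtype=float))
    flow = softmax(np.asarray(flow_scores, dtype=float))
    return (1.0 - flow_weight) * rgb + flow_weight * flow


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    steps = np.arange(1, 21, dtype=float)[:, None]
    direction = rng.normal(size=(1, 8))
    # Synthetic video: features drift along one direction over time, plus noise.
    video = steps * direction + 0.1 * rng.normal(size=(20, 8))
    w = rank_pool(video)
    print("descriptor shape:", w.shape)
    print("fused:", late_fusion([[2.0, 0.0, 0.0]], [[0.0, 3.0, 0.0]]))
```

The vector `w` plays the role of the fixed-size video descriptor that a downstream classifier would consume. In the paper's setting this step is instead learned inside the network so that the whole pipeline trains end to end.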


Journal: Neurocomputing
Publication Date: Aug 16, 2018
Citations: 3

Similar Papers

Hierarchical dynamic depth projected difference images–based action recognition in videos with convolutional neural networks
Hanbo Wu ... Xin Ma
International Journal of Advanced Robotic Systems | VOL. 16
01 Jan 2019

Discriminatively Learned Hierarchical Rank Pooling Networks
Basura Fernando ... Stephen Gould
International Journal of Computer Vision | VOL. 124
24 Jun 2017

Action Recognition with Dynamic Image Networks
Hakan Bilen ... Basura Fernando
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 40
02 Nov 2017

Dynamic Image Networks for Action Recognition
Hakan Bilen ... Andrea Vedaldi
01 Jun 2016
