Modeling Long-Term Dependencies from Videos Using Deep Multiplicative Neural Networks

Wen Si,Cong Liu,Zhongqin Bi,Meijing Shan

doi:10.1145/3357797

Abstract

Understanding temporal dependencies of videos is fundamental for vision problems, but deep learning–based models are still insufficient in this field. In this article, we propose a novel deep multiplicative neural network (DMNN) for learning hierarchical long-term representations from video. The DMNN is built upon the multiplicative block that remembers the pairwise transformations between consecutive frames using multiplicative interactions rather than the regular weighted-sum ones. The block is slided over the timesteps to update the memory of the networks on the frame pairs. Deep architecture can be implemented by stacking multiple layers of the sliding blocks. The multiplicative interactions lead to exact, rather than approximate, modeling of temporal dependencies. The memory mechanism can remember the temporal dependencies for an arbitrary length of time. The multiple layers output multiple-level representations that reflect the multi-timescale structure of video. Moreover, to address the difficulty of training DMNNs, we derive a theoretically sound convergent method, which leads to a fast and stable convergence. We demonstrate a new state-of-the-art classification performance with proposed networks on the UCF101 dataset and the effectiveness of capturing complicate temporal dependencies on a variety of synthetic datasets.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

Modeling Long-Term Dependencies from Videos Using Deep Multiplicative Neural Networks

Abstract

Talk to us

Similar Papers

More From: ACM Transactions on Multimedia Computing, Communications, and Applications

Lead the way for us

Journal: ACM Transactions on Multimedia Computing, Communications, and Applications	Publication Date: Apr 30, 2020
Citations: 2

Similar Papers

Long-term sequence dependency capture for spatiotemporal graph modeling
Longji Huang ... Jiangtao Cui
Knowledge-Based Systems | VOL. 278
Longji Huang, et. al.Longji Huang ... Jiangtao Cui
26 Jul 2023
Knowledge-Based Systems | VOL. 278

Recognition of Real-Time Video Activities Using Stacked Bi-GRU with Fusion-based Deep Architecture
Ujwala Thakur ... Amarjeet Prajapati
JUCS - Journal of Universal Computer Science | VOL. 30
Ujwala Thakur, et. al.Ujwala Thakur ... Amarjeet Prajapati
28 Sep 2024
JUCS - Journal of Universal Computer Science | VOL. 30

SAM-Net: Spatio-Temporal Sequence Typhoon Cloud Image Prediction Net with Self-Attention Memory
Yanzhao Ren ... Ruijun Liu
Remote Sensing | VOL. 16
Yanzhao Ren, et. al.Yanzhao Ren ... Ruijun Liu
12 Nov 2024
Remote Sensing | VOL. 16

Monaural Speech Enhancement Using Intra-Spectral Recurrent Layers in the Magnitude and Phase Responses
Khandokar Md Nayem ... Donald S Williamson
-
Khandokar Md Nayem, et. al.Khandokar Md Nayem ... Donald S Williamson
01 May 2020
01 May 2020

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Modeling Long-Term Dependencies from Videos Using Deep Multiplicative Neural Networks

Abstract

Talk to us

Similar Papers

More From: ACM Transactions on Multimedia Computing, Communications, and Applications