Abstract

We present VideoWhisper, a novel approach to unsupervised video representation learning in which the video sequence itself is treated as a self-supervision entity, based on the observation that the sequence encodes the video's temporal dynamics (e.g., object movement and event evolution). Specifically, for each video sequence, we use a pre-learned visual dictionary to generate a sequence of high-level semantics, dubbed the "whisper", which encodes both visual content at the frame level and visual dynamics at the sequence level. VideoWhisper is driven by a novel "sequence-to-whisper" learning strategy: an end-to-end sequence-to-sequence model based on recurrent neural networks (RNNs) is trained to predict the whisper sequence from the input frames. We propose two ways to derive a video representation from the trained model. Extensive experiments demonstrate that the video representation learned by VideoWhisper effectively boosts fundamental video-related applications such as video retrieval and classification.
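To make the sequence-to-whisper idea concrete, below is a minimal illustrative sketch of such an encoder-decoder model in PyTorch. All names, dimensions, and design choices here (a GRU encoder and decoder, 2048-dimensional frame features, a 1000-entry visual dictionary, teacher forcing, reading the representation from the final encoder state) are assumptions made for illustration only; the actual VideoWhisper architecture and its two representation readouts may differ.

# Illustrative sketch only: a hypothetical sequence-to-whisper model.
# Names and hyperparameters are assumptions, not the paper's specification.
import torch
import torch.nn as nn

class SequenceToWhisper(nn.Module):
    def __init__(self, frame_dim=2048, hidden_dim=512, dict_size=1000):
        super().__init__()
        # Encoder RNN summarizes the sequence of per-frame CNN features.
        self.encoder = nn.GRU(frame_dim, hidden_dim, batch_first=True)
        # Decoder RNN predicts the "whisper" sequence of dictionary entries.
        self.decoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.embed = nn.Embedding(dict_size, hidden_dim)
        self.classify = nn.Linear(hidden_dim, dict_size)

    def forward(self, frames, whisper_tokens):
        # frames: (batch, T, frame_dim) pre-extracted frame features
        # whisper_tokens: (batch, T) indices into the pre-learned visual dictionary
        _, enc_state = self.encoder(frames)
        # Teacher forcing: feed embedded whisper tokens, conditioned on the encoder state.
        dec_out, _ = self.decoder(self.embed(whisper_tokens), enc_state)
        return self.classify(dec_out)  # (batch, T, dict_size) logits

    def video_representation(self, frames):
        # One possible readout of a fixed-length video representation:
        # the final hidden state of the encoder.
        _, enc_state = self.encoder(frames)
        return enc_state.squeeze(0)  # (batch, hidden_dim)

# Hypothetical usage: 8-frame clips with 2048-d features per frame.
model = SequenceToWhisper()
frames = torch.randn(4, 8, 2048)
tokens = torch.randint(0, 1000, (4, 8))
logits = model(frames, tokens)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 1000), tokens.reshape(-1))

Training then amounts to minimizing this cross-entropy between the predicted and target whisper sequences; at test time, only the encoder is needed to extract the video representation.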
