Video Captioning by Adversarial LSTM.

Yang Yang, Jie Zhou, Alan Hanjalic, Jiangbo Ai, Heng Tao Shen, Yanli Ji, Yi Bin

doi:10.1109/tip.2018.2855422

Abstract

In this paper, we propose a novel approach to video captioning based on adversarial learning and Long Short-Term Memory (LSTM). With this approach, we aim to compensate for the deficiencies of LSTM-based video captioning methods, which generally show potential to effectively handle the temporal nature of video data when generating captions but typically suffer from exponential error accumulation. Specifically, we adopt a standard Generative Adversarial Network (GAN) architecture, characterized by an interplay of two competing processes: a "generator", which generates textual sentences given the visual content of a video, and a "discriminator", which controls the accuracy of the generated sentences. The discriminator acts as an "adversary" towards the generator and, with its controlling mechanism, helps the generator to become more accurate. For the generator module, we take an existing video captioning concept using an LSTM network. For the discriminator, we propose a novel realization specifically tuned for the video captioning problem that takes both the sentences and video features as input. This leads to our proposed LSTM-GAN system architecture, which we show experimentally to significantly outperform the existing methods on standard public datasets.
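The interplay between generator and discriminator described above can be illustrated with a small numerical sketch. It is not the architecture from the paper: the dot-product scoring function, the two-dimensional feature vectors, and the non-saturating form of the generator loss are all assumptions chosen for illustration, and every function name is hypothetical. The sketch only shows how a discriminator that sees both a sentence representation and the video features yields the two competing losses of a standard GAN.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_score(video_feat, sent_feat):
    """Toy discriminator (assumption, not the paper's design).

    Returns the estimated probability that `sent_feat` encodes a real
    caption for the video encoded by `video_feat`. Both inputs are
    assumed to live in the same embedding space, so their dot product
    serves as a match score.
    """
    logit = sum(v * s for v, s in zip(video_feat, sent_feat))
    return sigmoid(logit)


def discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator loss: reward high scores for real
    captions and low scores for generated ones."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss (a common choice, assumed here):
    the generator is rewarded when the discriminator is fooled."""
    return -math.log(d_fake)


# Hypothetical features: one video, a matching human-written caption,
# and a generated caption that does not match the video.
video = [1.0, 0.0]
real_caption = [1.0, 0.0]
generated_caption = [0.0, 1.0]

d_real = discriminator_score(video, real_caption)
d_fake = discriminator_score(video, generated_caption)

loss_d = discriminator_loss(d_real, d_fake)
loss_g = generator_loss(d_fake)
```

In this toy setting the generator lowers its loss only by producing sentence features that the discriminator scores as matching the video, which is the sense in which the discriminator's controlling mechanism pushes the generator towards more accurate captions.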


Journal: IEEE Transactions on Image Processing
Publication Date: Jul 12, 2018
Citations: 250

