Abstract

In spatio-temporal predictive coding problems, like next-frame prediction in video, determining the content of plausible future frames is primarily based on the image dynamics of previous frames. We establish an alternative approach based on the underlying semantic information of the data, for cases where the data do not necessarily incorporate a temporal aspect but instead comply with some form of associative ordering. In this work, we introduce the notion of semantic predictive coding by proposing a novel generative adversarial modeling framework which incorporates the arbiter classifier as a new component. While the generator is primarily tasked with the anticipation of possible next frames, the arbiter’s principal role is the assessment of their credibility. Taking into account that the denotative meaning of each forthcoming element can be encapsulated in a generic label descriptive of its content, a classification loss is introduced along with the adversarial loss. As supported by our experimental findings in a next-digit and a next-letter scenario, the utilization of the arbiter not only improves GAN performance but also broadens the network’s creative capabilities in terms of the diversity of the generated symbols.

Highlights

  • The recently discovered deep learning framework, Deep Neural Networks (DNNs), has revolutionized research in AI and machine learning, setting the stage for major breakthroughs in a wide variety of scientific disciplines [1,2,3,4]

  • We proposed an alternative approach to the problem of next-frame prediction, termed semantic predictive coding

  • Instead of drawing inferences based on the spatial information and the temporal dynamics of past frames, we take advantage of the semantic information concealed in the data in an effort to contextually guess the next element

Summary

Introduction

The recently (re-)discovered deep learning framework, Deep Neural Networks (DNNs), has revolutionized research in AI and machine learning, setting the stage for major breakthroughs in a wide variety of scientific disciplines [1,2,3,4]. The proposed framework, termed Arbitrated Generative Adversarial Network (A-GAN), differs from existing work in the next-frame prediction literature in one key respect: traditional approaches attempt to guess the most likely future from all plausible outcomes, guided primarily by the image dynamics of the previous frames, whereas in our case the prediction of each subsequent image is based solely on understanding and interpreting the interconnected visual semantics of the input sequence. Such a service could utilize the synthesized images in order to retrieve similar available items. Motivated by this novel perspective on the problem of next-frame prediction and, to the best of our knowledge, by the lack of relevant works that take this approach, we utilize Generative Adversarial Networks (GANs) [13] to tackle the issue at hand.
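The abstract states that a classification loss from the arbiter is introduced along with the adversarial loss. The exact formulation is not given in this summary, so the following is only a minimal sketch of how such a combined generator objective might look. The function names, the non-saturating form of the adversarial term, the plain sum, and the `weight` parameter are all assumptions made for illustration, not the authors' definition.

```python
import math

EPS = 1e-12  # guards against log(0)

def binary_cross_entropy(prob, target):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    prob = min(max(prob, EPS), 1.0 - EPS)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def cross_entropy(class_probs, label):
    """Categorical cross-entropy for one sample and its integer label."""
    return -math.log(max(class_probs[label], EPS))

def generator_loss(disc_prob_real, arbiter_probs, next_label, weight=1.0):
    """Hypothetical combined objective for the generator.

    disc_prob_real: discriminator's probability that the generated frame is real.
    arbiter_probs:  arbiter's class probabilities for the generated frame.
    next_label:     label of the element expected to come next.
    weight:         assumed trade-off factor between the two terms.
    """
    # Adversarial term: the generator wants the discriminator to say "real".
    adversarial = binary_cross_entropy(disc_prob_real, 1.0)
    # Classification term: the arbiter should recognise the expected next label.
    classification = cross_entropy(arbiter_probs, next_label)
    return adversarial + weight * classification
```

In a next-digit setting, for instance, `next_label` would be the digit expected to follow the input sequence; a generated image that the arbiter confidently assigns to that digit yields a smaller classification term and hence a smaller total loss.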

Related Work
Proposed Methodology
Generative Adversarial Networks
Arbiter Network
The A-GAN Framework
Dataset Manipulation
Experimental Setup
Qualitative and Quantitative Results
Chains of Consecutive Predictions
Arbiter’s Loss Versus L2 Loss
Findings
Conclusions