Abstract
Modelling multimedia data such as text, images, or videos usually involves their analysis, prediction, or reconstruction. The recurrent neural network (RNN) is a powerful machine learning approach for modelling such data recursively. As a variant, the long short-term memory (LSTM) extends the RNN with the ability to retain information over longer timescales. Whilst one can increase the capacity of an LSTM by widening it or adding layers, this usually requires additional parameters and runtime, which can make learning harder. We therefore propose the Tensor LSTM, in which the hidden states are tensorised as multidimensional arrays (tensors) and updated through a cross-layer convolution. As parameters are spatially shared within the tensor, we can efficiently widen the model without extra parameters by increasing the tensorised size; as the deep computations of each time step are absorbed by the temporal computations of the time series, we can implicitly deepen the model with little extra runtime by delaying the output. We show by experiments that our model is well suited to various multimedia data modelling tasks, including text generation, text calculation, image classification, and video prediction.
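The parameter-sharing claim in the abstract can be illustrated with a minimal sketch. The snippet below is not the paper's implementation; it only demonstrates, under assumed shapes and hypothetical names, how a hidden state tensorised into P locations can be updated by a convolution whose kernel is shared across locations, so that widening (increasing P) adds no parameters:

```python
import numpy as np

def tensorised_update(H, W, b):
    """One cross-location convolution step on a tensorised hidden state (sketch).

    H : (P, M) array   -- hidden state tensorised into P locations of M channels
    W : (K, M, M) array -- kernel of size K, shared across all P locations
    b : (M,) array      -- bias, also shared across locations
    """
    P, M = H.shape
    K = W.shape[0]
    pad = K // 2
    # zero-pad along the tensorised (location) dimension
    Hp = np.pad(H, ((pad, pad), (0, 0)))
    out = np.empty_like(H)
    for p in range(P):
        # each output location mixes a K-neighbourhood of input locations
        window = Hp[p:p + K]                       # (K, M)
        out[p] = np.tanh(np.einsum('km,kmn->n', window, W) + b)
    return out

rng = np.random.default_rng(0)
P, M, K = 5, 4, 3
H = rng.standard_normal((P, M))
W = rng.standard_normal((K, M, M)) * 0.1
b = np.zeros(M)
H2 = tensorised_update(H, W, b)
print(H2.shape)  # (5, 4)
```

Note that the parameter count (K * M * M + M) is independent of P: doubling the tensorised size widens the state without adding a single weight.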
Highlights
Multimedia data such as text, images, and videos are ubiquitous nowadays
We find that, in the addition task, 2-Tensor LSTM (tLSTM) + channel normalisation (CN) with L = 7 performs best and solves the task using only 298 K
We aim to handle a range of multimedia modelling tasks
Summary
Multimedia data such as text, images, and videos are ubiquitous nowadays. Modelling such data usually involves their analysis, prediction, or reconstruction. Text modelling relates to many natural language processing tasks such as sentiment analysis [1], part-of-speech tagging [2], machine translation [3], and question answering [4]; image modelling relates to many computer vision tasks such as image segmentation [5], depth reconstruction [6], image generation [7], and super-resolution [8]; and video modelling relates to many computer vision tasks such as object tracking [9], video segmentation [10], motion estimation [11], and video prediction [12]. Though diverse, these tasks can usually be formulated as a time series prediction problem, e.g., generating a desired output yt for a given time series x1:t = { x1 , x2 , · · · , xt }, for time t = 1, 2, · · ·.
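The time series formulation above, generating yt from the prefix x1:t, can be sketched with a plain recurrent predictor. This is a generic illustration with assumed shapes and hypothetical names, not the paper's tLSTM: the recurrent state h summarises x1:t, so each output depends on the whole prefix.

```python
import numpy as np

def rnn_predict(xs, Wxh, Whh, Why, bh, by):
    """Produce an output y_t for each prefix x_{1:t} of the series."""
    h = np.zeros(Whh.shape[0])
    ys = []
    for x in xs:                                # t = 1, 2, ...
        h = np.tanh(Wxh @ x + Whh @ h + bh)     # state summarises x_{1:t}
        ys.append(Why @ h + by)                 # y_t depends on x_{1:t} via h
    return ys

rng = np.random.default_rng(1)
D, H, O = 3, 8, 2                               # input, hidden, output sizes
xs = [rng.standard_normal(D) for _ in range(4)]
Wxh = rng.standard_normal((H, D)) * 0.1
Whh = rng.standard_normal((H, H)) * 0.1
Why = rng.standard_normal((O, H)) * 0.1
ys = rnn_predict(xs, Wxh, Whh, Why, np.zeros(H), np.zeros(O))
print(len(ys), ys[0].shape)  # 4 (2,)
```

An LSTM replaces the single tanh update with gated cell and hidden states, and the tLSTM further tensorises h and updates it by the shared cross-layer convolution described in the abstract.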