Abstract

Recurrent neural networks (RNNs) operate over sequences of vectors and have been successfully applied to a variety of problems. However, it is hard for RNNs to model the variable dwell time of the hidden state underlying an input sequence. In this article, we interpret the typical RNNs, including the original RNN, standard long short-term memory (LSTM), peephole LSTM, projected LSTM, and gated recurrent unit (GRU), using a slightly extended hidden Markov model (HMM). Based on this interpretation, we propose a novel RNN, called the explicit duration recurrent network (EDRN), analogous to a hidden semi-Markov model (HSMM). It performs better than conventional LSTMs and can explicitly model any duration distribution of the hidden state. The model parameters become interpretable and can be used to infer many quantities that conventional RNNs cannot obtain. EDRN is therefore expected to extend and enrich the applications of RNNs. The interpretation also suggests that small modifications to conventional RNNs, including LSTM and GRU, can improve their performance without increasing the number of network parameters.

Highlights

  • The recurrent neural networks (RNNs) have been successfully applied to various sequence learning problems, such as speech recognition, language modeling, translation, image captioning, health detection, remote sensing, and intelligent transportation

  • We have shown that the slightly extended hidden Markov model (SE-HMM) can interpret the typical RNNs, including the original RNN, standard long short-term memory (LSTM), peephole LSTM, projected LSTM (PLSTM), and gated recurrent unit (GRU)

  • Just as the SE-HMM extends to the hidden semi-Markov model (HSMM), which can construct any probability density function (PDF) of state duration, the explicit duration recurrent network architecture, EDRN, can capture the varying duration of the hidden state that governs the sequences


Summary

INTRODUCTION

The recurrent neural networks (RNNs) have been successfully applied to various sequence learning problems, such as speech recognition, language modeling, translation, image captioning, health detection, remote sensing, and intelligent transportation. Sahin and Kozat [10] incorporate the time gap between consecutive samples as a nonlinear scaling factor on the conventional gates of the classical LSTM network and use this extended network to process nonuniformly sampled variable-length sequential data. However, this methodology cannot be extended to model unknown, varying time information underlying the input time series. 1) We use the slightly extended HMM (SE-HMM) framework to interpret and unify the typical RNNs. 2) We further extend the SE-HMM to a new HSMM that can construct any probability density function (PDF) of state duration. Based on this HSMM, we propose a novel explicit duration RNN, called EDRN, that can capture varying periods of the underlying state that governs the input sequences. 3) Small modifications to the standard LSTM and GRU can improve their performance without increasing the complexity of the networks.
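The dwell-time limitation that motivates EDRN can be illustrated with a short sketch (an illustrative assumption, not the paper's implementation): a standard HMM's self-transition probability forces a geometric duration distribution on each hidden state, whereas an explicit-duration (HSMM-style) model can store an arbitrary normalized PDF over durations, e.g. one peaked at a particular dwell time.

```python
import numpy as np

def hmm_duration_pdf(p, max_d):
    """Dwell-time PDF implied by an HMM self-transition probability p:
    P(d) = (1 - p) * p**(d - 1), i.e. geometric and monotone decreasing."""
    d = np.arange(1, max_d + 1)
    return (1 - p) * p ** (d - 1)

def explicit_duration_pdf(weights):
    """An HSMM-style explicit duration PDF: any nonnegative histogram,
    normalized. The shape is free, e.g. peaked at an intermediate duration."""
    w = np.asarray(weights, dtype=float)
    return w / w.sum()

geom = hmm_duration_pdf(0.6, 10)                       # always decays from d=1
peaked = explicit_duration_pdf([0.1, 0.5, 2.0, 0.5, 0.1])  # mode at d=3
```

A geometric PDF can never place its mode at d > 1, which is why self-transitions alone cannot represent states that typically persist for several steps; the explicit histogram can.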

SLIGHT EXTENSION TO STANDARD HMM
Definition of the Model
Forward Recursion Formulas
HMM VIEW ON TYPICAL RNNS
HMM View on the Original RNN
HMM View on the Standard LSTM
HMM View on Peephole LSTM
HMM View on GRU
HMM View on the Projected LSTM
EXPLICIT DURATION RECURRENT NETWORKS
Definition of the Explicit Duration Recurrent Network
Complexity of EDRN
Inference From the Parameters
Constructing Any Parametric State Duration Distribution
EVALUATION
Outperforming PLSTM and LSTM
Variable Duration and Meaningful State
Modified GRU
CONCLUSION

