New RNN Activation Technique for Deeper Networks: LSTCM Cells

Soo-Han Kang, Ji-Hyeong Han

doi:10.1109/access.2020.3040405

Open Access

https://doi.org/10.1109/access.2020.3040405

Abstract

Long short-term memory (LSTM) has shown good performance on sequential data, but the vanishing or exploding gradient problem can arise, especially when deeper layers are used to solve complex problems. In this paper, we therefore propose a new LSTM cell, termed long short-time complex memory (LSTCM), that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as the activation function for both LSTM and the proposed LSTCM, in place of the hyperbolic tangent. The performance of the proposed LSTCM cell and the sinusoidal activation function is demonstrated through experiments on several natural language benchmark datasets: Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German.
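To make the activation swap concrete, here is a minimal NumPy sketch of one step of a standard LSTM cell in which the activation is a parameter, so the hyperbolic tangent can be replaced by a sine. This is an illustration only: the LSTCM equations are not given on this page, so the sketch does not implement the LSTCM cell itself. The function name `lstm_step`, the gate ordering, and the weight shapes are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b, act=np.tanh):
    """One step of a standard LSTM cell with a pluggable activation.

    Assumed (hypothetical) layout: W is (4*H, D), U is (4*H, H), b is (4*H,),
    with the gate pre-activations stacked in the order [i, f, o, g].
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    o = sigmoid(z[2 * H:3 * H])  # output gate
    g = act(z[3 * H:4 * H])      # candidate values (tanh in a standard LSTM)
    c = f * c_prev + i * g       # new cell state
    h = o * act(c)               # new hidden state
    return h, c

# Run the same step with tanh and with sin as the activation.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.5, size=(4 * H, D))
U = rng.normal(scale=0.5, size=(4 * H, H))
b = np.zeros(4 * H)
x = rng.normal(size=D)
h0, c0 = np.zeros(H), np.zeros(H)

h_tanh, c_tanh = lstm_step(x, h0, c0, W, U, b, act=np.tanh)
h_sin, c_sin = lstm_step(x, h0, c0, W, U, b, act=np.sin)
```

Both activations are bounded in [-1, 1], so the hidden state stays in that range in either case; what differs is the shape of the nonlinearity and of its derivative.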

Highlights

Recently, deep learning approaches including feed-forward networks, convolutional neural networks (CNNs), and recurrent neural networks (RNNs) have shown good performance in many fields.
On a language modeling task based on the Penn Treebank (PTB) dataset, the gap in perplexity between the training and test datasets was smaller when independently long short-term memory (ILSTM) and ILSTCM were applied than when long short-term memory (LSTM) and long short-time complex memory (LSTCM) were applied, meaning that the independent concept helps prevent network overfitting.
This paper proposed a long short-time complex memory (LSTCM) cell to address the gradient vanishing problem in recurrent neural networks (RNNs) and long short-term memory (LSTM), especially when the network is deep.

Summary

INTRODUCTION

Recently, deep learning approaches including feed-forward networks, convolutional neural networks (CNNs), and recurrent neural networks (RNNs) have shown good performance in many fields. The basic approach to solving complex problems with deep learning is to create a deeper or more complex network. This also holds in RNN research, where multiple recurrent layers are stacked or more complex cells are used, such as long short-term memory (LSTM) [10], gated recurrent unit (GRU) [11], and neural architecture search (NAS) [12] cells. The proposed cell is referred to as long short-time complex memory (LSTCM). With this new activation technique, the proposed LSTCM cell reduces the gradient vanishing problem in the layers, which makes it possible to create and train a deeper network for complex problems.
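As a toy illustration of why depth makes the gradient vanishing problem worse, the sketch below multiplies the local derivatives of a chain of scalar tanh layers. It is not taken from the paper: the function name `chain_gradient`, the scalar setting, and the fixed weight are assumptions chosen to keep the example small.

```python
import numpy as np

def chain_gradient(depth, x, weight=1.0):
    """Derivative of the output of `depth` stacked scalar tanh layers w.r.t. x.

    Each layer computes a = tanh(weight * a_prev). By the chain rule, the
    overall derivative is the product of weight * (1 - tanh(z)**2) per layer.
    """
    a = x
    grad = 1.0
    for _ in range(depth):
        a = np.tanh(weight * a)
        grad *= weight * (1.0 - a ** 2)  # local derivative of this layer
    return grad

# Gradient magnitude for increasingly deep chains, starting from x = 1.0.
for depth in (1, 5, 20):
    print(depth, chain_gradient(depth, 1.0))
```

With a weight of 1, every local factor is below 1, so the product shrinks as layers are added. Real networks use matrices and learned weights, but the same multiplicative effect is what the deeper recurrent stacks discussed above have to contend with.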

METHODS

BACKPROPAGATION THROUGH TIME IN LSTM

THE PROPOSED LSTCM CELL

RECURRENT WEIGHT

USING AN ACTIVATION FUNCTION WITH A SINUSOIDAL FUNCTION

LANGUAGE MODELING TASK

WEIGHT INITIALIZATION IN LSTCM

CONCLUSION

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 13
License type: CC BY 4.0

Similar Papers

Efficient Implementation of Activation Functions for LSTM accelerators
Yi Sheng Chong ... Anh Tuan Do
04 Oct 2021

Developing Novel Activation Functions Based Deep Learning LSTM for Classification
Mohamed H Essai Ali ... Eman A Badry
IEEE Access | VOL. 10
01 Jan 2021

Determining the Best Activation Functions for Predicting Stock Prices in Different (Stock Exchanges) Through Multivariable Time Series Forecasting of LSTM
Australian Journal of Engineering and Innovative Technology
03 Apr 2023

A comparative performance analysis of different activation functions in LSTM networks for classification
Amir Farzad ... Hamid Hassanpour
Neural Computing and Applications | VOL. 31
19 Oct 2017