Long short-term memory with activation on gradient

Chuan Qin, Liangming Chen, Zangtai Cai, Mei Liu, Long Jin

doi:10.1016/j.neunet.2023.04.026

Abstract

As the number of long short-term memory (LSTM) layers increases, the vanishing/exploding gradient problems worsen and degrade the performance of the LSTM. In addition, ill-conditioning arises during LSTM training and adversely affects its convergence. In this work, a simple and effective method, gradient activation, is applied to the LSTM, and empirical criteria for choosing the gradient activation hyperparameters are identified. Activating the gradient means modifying the gradient with a specific function, called the gradient activation function. Different activation functions and different gradient operations are compared to demonstrate that gradient activation is effective on the LSTM. Comparative experiments further show that gradient activation alleviates the above problems and accelerates the convergence of the LSTM. The source code is publicly available at https://github.com/LongJin-lab/ACT-In-NLP.
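
The idea of passing the gradient through a function before the parameter update can be illustrated with a minimal sketch. The abstract does not specify the gradient activation function or its hyperparameters, so the scaled tanh used here, the names `activate_gradient` and `sgd_step`, and the parameters `alpha` and `beta` are all assumptions made for illustration, not the authors' implementation (which is in the linked repository).

```python
import math

def activate_gradient(grad, alpha=1.0, beta=1.0):
    """Apply an elementwise function to a gradient vector.

    ASSUMPTION: alpha * tanh(beta * g) is a stand-in chosen for this sketch;
    the paper's actual gradient activation function is not given in the abstract.
    """
    return [alpha * math.tanh(beta * g) for g in grad]

def sgd_step(params, grad, lr=0.1, activation=None):
    """One plain SGD update, optionally activating the gradient first."""
    if activation is not None:
        grad = activation(grad)
    return [p - lr * g for p, g in zip(params, grad)]

def quad_grad(w):
    """Gradient of a toy ill-conditioned quadratic:
    f(w) = 0.5 * (100 * w0^2 + 0.01 * w1^2)."""
    return [100.0 * w[0], 0.01 * w[1]]

w = [1.0, 1.0]
raw = quad_grad(w)              # components differ by a factor of 10^4
act = activate_gradient(raw)    # large component is bounded by alpha

print("raw gradient:      ", raw)
print("activated gradient:", act)
print("updated parameters:", sgd_step(w, raw, lr=0.1, activation=activate_gradient))
```

In this toy case the bounded function caps the very large gradient component while leaving the small one nearly unchanged, so the spread between components shrinks. Whether this matches the behavior of the function used in the paper is not something the abstract confirms.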


Journal: Neural Networks
Publication Date: Apr 25, 2023
Citations: 4

Similar Papers

Performance of Three Slim Variants of The Long Short-Term Memory (LSTM) Layer
Daniel Kent ... Fathi Salem
01 Aug 2019

Gujarati Task Oriented Dialogue Slot Tagging Using Deep Neural Network Models
Rachana Parikh ... Hiren Joshi
01 Jan 2020

Share Market Prediction Using Long Short Term Memory and Artificial Neural Network
J. Aruna Jasmine ... S. Susila Sakthy
16 Dec 2021

Novel neural network architecture for energy prediction
Hae Jin Kim ... Panos P Markopoulos
31 May 2022
