Abstract

Recurrent neural network (RNN) architectures have been used for language modeling (LM) tasks that require learning long-range word or character sequences. However, RNN architectures still suffer from unstable gradients on long-range sequences. To address this issue, attention mechanisms have been adopted, achieving state-of-the-art (SOTA) performance on many LM tasks. A differentiable neural computer (DNC) is one such attention-based architecture: a neural network augmented with a content-addressable external memory. However, during write operations, information unrelated to the current input word can remain in memory. Moreover, DNCs have been found to perform poorly when the number of weight parameters is small. Therefore, we propose a robust memory deallocation method using a limited retention vector. The limited retention vector determines whether the network increases or decreases its usage of information in external memory according to a threshold. We experimentally evaluate the robustness of a DNC implementing the proposed approach with respect to controller and external-memory size on the enwik8 LM task. When the number of weight parameters was reduced by 32.47%, the proposed DNC showed only a 4.30% degradation in bits per character (BPC), demonstrating the effectiveness of our approach for language modeling tasks.
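To make the deallocation idea concrete, below is a minimal sketch of a DNC-style retention vector with a deallocation threshold. The abstract does not specify the exact formulation of the limited retention vector, so the threshold `theta`, the free gates `f`, and the read weightings `w_read` are illustrative assumptions layered on the standard DNC memory-usage update (Graves et al., 2016), not the paper's definitive implementation.

```python
import numpy as np

def limited_retention(w_read, f, theta=0.5):
    """Sketch of a thresholded ("limited") retention vector over N memory slots.

    w_read : (R, N) read weightings from the previous time step
    f      : (R,)   free gates in [0, 1], one per read head
    theta  : deallocation threshold (assumed hyperparameter)
    """
    # Standard DNC retention: psi_i = prod_r (1 - f_r * w_read[r, i])
    psi = np.prod(1.0 - f[:, None] * w_read, axis=0)
    # Assumed "limited" variant: slots whose retention falls below the
    # threshold are fully deallocated, so content unrelated to the
    # current input does not linger in external memory.
    return np.where(psi < theta, 0.0, psi)

def update_usage(u_prev, w_write_prev, psi):
    # Standard DNC usage update, scaled by the (limited) retention:
    # u_t = (u_{t-1} + w^w_{t-1} - u_{t-1} * w^w_{t-1}) * psi_t
    return (u_prev + w_write_prev - u_prev * w_write_prev) * psi

# Example with R = 2 read heads and N = 4 memory slots.
w_read = np.array([[0.7, 0.1, 0.1, 0.1],
                   [0.2, 0.6, 0.1, 0.1]])
f = np.array([0.9, 0.8])
psi = limited_retention(w_read, f, theta=0.4)
u = update_usage(np.full(4, 0.5), np.array([0.8, 0.1, 0.05, 0.05]), psi)
```

Under this reading, thresholding drives near-free slots to zero usage outright rather than letting small residual weights accumulate, which is one plausible way the method keeps stale information from persisting in memory.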
