Abstract

Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data. We demonstrate the power of RMN on language modeling and sentence completion tasks. On language modeling, RMN outperforms Long Short-Term Memory (LSTM) networks on three large German, Italian, and English datasets. Additionally, we perform an in-depth analysis of the various linguistic dimensions that RMN captures. On the Sentence Completion Challenge, for which it is essential to capture sentence coherence, our RMN obtains 69.2% accuracy, surpassing the previous state of the art by a large margin.

Highlights

  • Recurrent Neural Networks (RNNs) (Elman, 1990; Mikolov et al., 2010) are remarkably powerful models for sequential data

  • We propose Recurrent Memory Network (RMN), a novel RNN architecture that combines the strengths of both Long Short-Term Memory (LSTM) and Memory Networks (Sukhbaatar et al., 2015)

  • We make the following contributions: 1. We propose a novel RNN architecture that complements LSTM in language modeling


Summary

Introduction

Recurrent Neural Networks (RNNs) (Elman, 1990; Mikolov et al., 2010) are remarkably powerful models for sequential data. Within the context of natural language processing, a common assumption is that LSTMs are able to capture certain linguistic phenomena. Evidence supporting this assumption mainly comes from evaluating LSTMs in downstream applications: Bowman et al. (2015) carefully design two artificial datasets where sentences have explicit recursive structures. They show empirically that, while processing the input linearly, LSTMs can implicitly exploit the recursive structures of language. We perform an analysis along various linguistic dimensions that our model captures. This is possible only because the Memory Block allows us to look into its internal states and its explicit use of additional inputs at each time step. On the Sentence Completion Challenge (Zweig and Burges, 2012), our model achieves an impressive 69.2% accuracy, surpassing the previous state of the art of 58.9% by a large margin.
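To make the interpretability claim concrete, the sketch below shows how a Memory-Network-style attention block over the most recent input words can be queried with the current LSTM hidden state, yielding attention weights that can be inspected directly. This is a minimal NumPy illustration in the spirit of Sukhbaatar et al. (2015); the function name memory_block and the projection matrices M and C are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D score vector
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def memory_block(h_t, recent_embs, M, C):
    """Hedged sketch of an attention-based memory block.

    h_t         : (d,)   current LSTM hidden state, used as the query
    recent_embs : (n, d) embeddings of the n most recent input words
    M, C        : (d, d) input- and output-memory projections
                  (hypothetical names, chosen for this illustration)
    """
    keys = recent_embs @ M.T      # input memory representations, shape (n, d)
    values = recent_embs @ C.T    # output memory representations, shape (n, d)
    scores = keys @ h_t           # one relevance score per memory slot, shape (n,)
    attn = softmax(scores)        # attention weights over recent words
    context = attn @ values       # weighted sum of output memories, shape (d,)
    return context, attn          # attn is the quantity one would inspect

# toy usage with random parameters
d, n = 4, 5
rng = np.random.default_rng(0)
context, attn = memory_block(rng.normal(size=d), rng.normal(size=(n, d)),
                             rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(attn)  # sums to 1; shows how much each recent word contributes
```

Because the returned attention weights form a distribution over the recent words, they can be aggregated over a corpus to study which positions, lexical items, or syntactic relations the model attends to, which is the kind of analysis reported in the paper.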

Recurrent Neural Networks
Recurrent Memory Network
Memory Block
RMN Architectures
Language Model Experiments
Results
Attention Analysis
Positional and lexical analysis
Syntactic analysis
Conclusion
