Abstract

Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data. We demonstrate the power of RMN on language modeling and sentence completion tasks. On language modeling, RMN outperforms Long Short-Term Memory (LSTM) networks on three large German, Italian, and English datasets. Additionally, we perform an in-depth analysis of the various linguistic dimensions that RMN captures. On the Sentence Completion Challenge, for which it is essential to capture sentence coherence, our RMN obtains 69.2% accuracy, surpassing the previous state of the art by a large margin.

Highlights

  • Recurrent Neural Networks (RNNs) (Elman, 1990; Mikolov et al., 2010) are remarkably powerful models for sequential data

  • We propose Recurrent Memory Network (RMN), a novel RNN architecture that combines the strengths of both Long Short-Term Memory (LSTM) and Memory Networks (Sukhbaatar et al., 2015)

  • We make the following contributions: 1. We propose a novel RNN architecture that complements LSTM in language modeling


Summary

Introduction

Recurrent Neural Networks (RNNs) (Elman, 1990; Mikolov et al., 2010) are remarkably powerful models for sequential data. Within the context of natural language processing, a common assumption is that LSTMs are able to capture certain linguistic phenomena. Evidence supporting this assumption mainly comes from evaluating LSTMs in downstream applications: Bowman et al. (2015) carefully design two artificial datasets where sentences have explicit recursive structures. They show empirically that, while processing the input linearly, LSTMs can implicitly exploit the recursive structures of language. We perform an analysis along various linguistic dimensions that our model captures. This is possible only because the Memory Block allows us to look into its internal states and its explicit use of additional inputs at each time step. On the Sentence Completion Challenge (Zweig and Burges, 2012), our model achieves an impressive 69.2% accuracy, surpassing the previous state of the art of 58.9% by a large margin.
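To make the interpretability claim concrete, the sketch below shows how a Memory-Network-style attention block over the most recent input words can be queried with the current LSTM hidden state, yielding attention weights that can be inspected directly. This is a minimal NumPy illustration in the spirit of Sukhbaatar et al. (2015); the function name memory_block and the projection matrices M and C are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D score vector
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def memory_block(h_t, recent_embs, M, C):
    """Hedged sketch of an attention-based memory block.

    h_t         : (d,)   current LSTM hidden state, used as the query
    recent_embs : (n, d) embeddings of the n most recent input words
    M, C        : (d, d) input- and output-memory projections
                  (hypothetical names, chosen for this illustration)
    """
    keys = recent_embs @ M.T      # input memory representations, shape (n, d)
    values = recent_embs @ C.T    # output memory representations, shape (n, d)
    scores = keys @ h_t           # one relevance score per memory slot, shape (n,)
    attn = softmax(scores)        # attention weights over recent words
    context = attn @ values       # weighted sum of output memories, shape (d,)
    return context, attn          # attn is the quantity one would inspect

# toy usage with random parameters
d, n = 4, 5
rng = np.random.default_rng(0)
context, attn = memory_block(rng.normal(size=d), rng.normal(size=(n, d)),
                             rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(attn)  # sums to 1; shows how much each recent word contributes
```

Because the returned attention weights form a distribution over the recent words, they can be aggregated over a corpus to study which positions, lexical items, or syntactic relations the model attends to, which is the kind of analysis reported in the paper.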

Recurrent Neural Networks
Recurrent Memory Network
Memory Block
RMN Architectures
Language Model Experiments
Results
Attention Analysis
Positional and lexical analysis
Syntactic analysis
Conclusion
