Abstract

Context-free grammars (CFGs) were among the first formal tools used to model natural languages, and they remain relevant today as the basis of several frameworks. A key ingredient of CFGs is nested recursion. In this paper, we investigate experimentally the capability of several recurrent neural networks (RNNs) to learn nested recursion. More precisely, we measure an upper bound on their capability to do so by simplifying the task to learning a generalized Dyck language, namely one composed of matching parentheses of various kinds. To do so, we present the RNNs with a set of random strings having a given maximum nesting depth and test their ability to predict the kind of closing parenthesis when facing more deeply nested strings. We report mixed results: when generalizing to deeper nesting levels, the accuracy of standard RNNs is significantly better than chance, but still far from perfect. Additionally, we propose some non-standard stack-based models which can approach perfect accuracy, at the cost of robustness.
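
The paper's exact data-generation procedure is not reproduced on this page; the sketch below illustrates, under assumed choices (a three-pair bracket alphabet and a simple random open/close policy), how strings of a generalized Dyck language with a bounded nesting depth can be sampled, together with the closing-parenthesis targets the RNN is asked to predict.

```python
import random

# Assumed bracket inventory; the paper's generalized Dyck language uses
# several kinds of matching parentheses, but the exact alphabet is an
# assumption made here for illustration.
PAIRS = {"(": ")", "[": "]", "{": "}"}


def dyck_string(max_depth, p_open=0.5, rng=random):
    """Sample a well-nested string over PAIRS whose nesting depth never
    exceeds max_depth. An illustrative sampler, not the authors' procedure."""
    out, stack = [], []
    while True:
        if stack and (len(stack) >= max_depth or rng.random() > p_open):
            out.append(PAIRS[stack.pop()])        # close the innermost bracket
        else:
            opener = rng.choice(list(PAIRS))      # open a new bracket of a random kind
            stack.append(opener)
            out.append(opener)
        if not stack and out:                     # string is balanced: stop
            return "".join(out)


def closing_targets(s):
    """For each closing position, the kind of parenthesis a model should predict."""
    stack, targets = [], []
    for i, ch in enumerate(s):
        if ch in PAIRS:
            stack.append(ch)
        else:
            targets.append((i, PAIRS[stack.pop()]))
    return targets


if __name__ == "__main__":
    s = dyck_string(max_depth=3)
    print(s)
    print(closing_targets(s))
```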

Highlights

  • In many settings, Recurrent Neural Networks (RNNs) act as generative language models

  • The long short-term memory (LSTM) shows near-perfect accuracy on strings within the nesting depths it was trained on

  • By and large, the LSTM generalizes to unseen nesting depths with respectable, though far from perfect, accuracy


Summary

Introduction

Recurrent Neural Networks (RNNs) act as generative language models. Popular RNNs functioning on these principles include the long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and the gated recurrent unit (GRU) (Cho et al., 2014). Thanks to their versatility, relative ease of training and ability to model long-term dependencies, RNNs have become the leading tool for natural language processing. Even experienced computational linguists use words such as “amazing” or even “magic” to describe them, betraying that it remains mysterious how, by performing arithmetic operations, the RNN can effectively mimic human linguistic production (Karpathy, 2016). This combination of poor understanding and enthusiasm may lead less experienced researchers into believing that the capabilities of LSTM RNNs are limitless, and that, with enough data, they can model any language you throw at them.
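
No implementation accompanies this summary; as a minimal sketch of what "RNNs as generative language models" means in this setting, the snippet below defines an LSTM that reads a prefix of a bracket string and outputs a probability distribution over the next symbol. The vocabulary, layer sizes and the use of PyTorch are assumptions for illustration, not the paper's reported setup.

```python
import torch
import torch.nn as nn

# Illustrative next-token language model over a small bracket vocabulary.
# Embedding and hidden sizes are assumed, not the paper's hyperparameters.
VOCAB = ["(", ")", "[", "]", "{", "}"]
STOI = {c: i for i, c in enumerate(VOCAB)}


class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size=len(VOCAB), embed_dim=16, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) integer ids; returns next-token logits.
        emb = self.embed(tokens)
        hidden, state = self.lstm(emb, state)
        return self.out(hidden), state


if __name__ == "__main__":
    model = LSTMLanguageModel()
    s = "([()])"
    ids = torch.tensor([[STOI[c] for c in s]])
    logits, _ = model(ids)                        # (1, len(s), |VOCAB|)
    # Distribution over the next symbol after reading the whole prefix;
    # training on Dyck strings would sharpen these probabilities.
    probs = torch.softmax(logits[0, -1], dim=-1)
    print({c: round(float(p), 3) for c, p in zip(VOCAB, probs)})
```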

LSTM

We use the variant of the LSTM RNN defined by the following equations:
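
The equations are not reproduced on this page; the standard formulation of Hochreiter and Schmidhuber (1997), without peephole connections, is given below for reference, though the variant used in the paper may differ in such details.

```latex
\[
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
\]
```
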
Generalized-Dyck Language
Interpretation
Subtask A
Results
Subtask B
Related work
Learnability of depth recursion
Suitability of RNN variants
Deep recursion in natural language