Abstract

Gated recurrent units (GRUs) are specialized memory elements for building recurrent neural networks. Despite their remarkable success on various tasks, including extracting dynamics underlying neural data, little is understood about the specific dynamics representable in a GRU network. As a result, it is difficult to know a priori how well a GRU network will perform on a given task, or to assess its capacity to mimic the underlying behavior of its biological counterparts. Using a continuous-time analysis, we gain intuition into the inner workings of GRU networks. We restrict our presentation to low dimensions, allowing for comprehensive visualization. We find a surprisingly rich repertoire of dynamical features, including stable limit cycles (nonlinear oscillations), multi-stable dynamics with various topologies, and homoclinic bifurcations. At the same time, we were unable to train GRU networks to produce continuous attractors, which are hypothesized to exist in biological neural networks. We contextualize the usefulness of the different kinds of observed dynamics and support our claims experimentally.

Highlights

  • Recurrent neural networks (RNNs) can capture and utilize sequential structure in natural and artificial languages, speech, video, and various other forms of time series

  • While RNNs are traditionally implemented in discrete time, we show that the discrete-time update of gated recurrent units (GRUs) can be interpreted as a numerical approximation of an underlying system of ordinary differential equations

  • As a non-local dynamical feature, it can be shown that a 2D GRU can undergo a homoclinic bifurcation, in which a periodic orbit expands and collides with a saddle at the bifurcation point
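The Euler-discretization view in the second highlight can be sketched concretely. Below, a minimal NumPy GRU (standard update equations, with arbitrary illustrative weights and no input drive) is compared against a forward-Euler integration of the implied ODE dh/dt = (1 − z) ⊙ (h̃ − h). With step size Δt = 1, the Euler step reproduces the discrete GRU update exactly; this is a sketch under those assumptions, not the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 2  # 2D hidden state, matching the paper's low-dimensional setting
# Illustrative recurrent weights and biases (input x assumed zero)
Uz, Ur, Uh = rng.normal(size=(3, n, n))
bz, br, bh = rng.normal(size=(3, n))

def gru_step(h):
    """One standard discrete-time GRU update: h_{t+1} = z*h + (1-z)*htilde."""
    z = sigmoid(Uz @ h + bz)             # update gate
    r = sigmoid(Ur @ h + br)             # reset gate
    htilde = np.tanh(Uh @ (r * h) + bh)  # candidate state
    return z * h + (1.0 - z) * htilde

def gru_ode(h):
    """Right-hand side of the continuous-time limit: dh/dt = (1-z)*(htilde - h)."""
    z = sigmoid(Uz @ h + bz)
    r = sigmoid(Ur @ h + br)
    htilde = np.tanh(Uh @ (r * h) + bh)
    return (1.0 - z) * (htilde - h)

def euler_step(h, dt):
    return h + dt * gru_ode(h)

h0 = rng.normal(size=n)
# With dt = 1, forward Euler recovers the discrete GRU update exactly,
# since h + (1-z)*(htilde - h) = z*h + (1-z)*htilde:
print(np.allclose(euler_step(h0, 1.0), gru_step(h0)))  # True
```

Smaller step sizes then trace out the continuous-time flow of the same vector field, which is what makes phase-plane tools (fixed points, limit cycles, bifurcations) applicable to GRUs.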


Summary

INTRODUCTION

Recurrent neural networks (RNNs) can capture and utilize sequential structure in natural and artificial languages, speech, video, and various other forms of time series. Certain mechanistic tasks, such as unbounded counting, come easily to LSTM networks but not to GRU networks (Weiss et al., 2018). Despite these empirical findings, we lack a systematic understanding of the internal time evolution of the GRU's memory structure and its capability to represent nonlinear temporal dynamics. Such an understanding would make clear what specific tasks (natural and artificial) can or cannot be performed (Bengio et al., 1994), how computation is implemented (Beer, 2006; Sussillo and Barak, 2012), and help to predict qualitative behavior (Beer, 1995; Zhao and Park, 2016). We recommend Meiss (Meiss, 2007) for more background on the subject.
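As a small illustration of the kind of stability analysis carried out below for a one-dimensional GRU, consider a scalar GRU stripped to its essentials: update gate fixed at z = 1/2, reset gate at r = 1, and candidate state tanh(2h). These are hypothetical parameter choices for illustration, not values from the paper. The resulting map F(h) = z·h + (1 − z)·tanh(2h) has three fixed points, and the slope |F′(h*)| classifies each as stable or unstable in discrete time:

```python
import numpy as np

def F(h):
    """Autonomous scalar GRU map with z = 0.5, r = 1, candidate tanh(2h)."""
    z = 0.5
    return z * h + (1.0 - z) * np.tanh(2.0 * h)

def Fprime(h, eps=1e-6):
    """Central-difference derivative; discrete-time stability needs |F'(h*)| < 1."""
    return (F(h + eps) - F(h - eps)) / (2.0 * eps)

def bisect_fixed_point(lo, hi, tol=1e-10):
    """Find h* with F(h*) = h* by bisection on g(h) = F(h) - h."""
    g = lambda h: F(h) - h
    assert g(lo) * g(hi) < 0  # a sign change brackets a root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

h_star = bisect_fixed_point(0.5, 2.0)  # positive fixed point, near 0.9575
print(abs(Fprime(h_star)) < 1.0)       # True: this fixed point is stable
print(abs(Fprime(0.0)) < 1.0)          # False: the origin is unstable
```

By symmetry, −h* is a second stable fixed point, so this toy scalar GRU is bistable, a miniature version of the multi-stable dynamics reported in the abstract.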

UNDERLYING CONTINUOUS TIME SYSTEM OF GATED RECURRENT UNITS
STABILITY ANALYSIS OF A ONE DIMENSIONAL GRU
ANALYSIS OF A TWO DIMENSIONAL GRU
EXPERIMENTS
Limit Cycle
Ring Attractor
DISCUSSION
DATA AVAILABILITY STATEMENT
