Abstract

Can advances in NLP help advance cognitive modeling? We examine the role of artificial neural networks, the current state of the art in many common NLP tasks, by returning to a classic case study. In 1986, Rumelhart and McClelland famously introduced a neural architecture that learned to transduce English verb stems to their past tense forms. Shortly thereafter in 1988, Pinker and Prince presented a comprehensive rebuttal of many of Rumelhart and McClelland’s claims. Much of the force of their attack centered on the empirical inadequacy of the Rumelhart and McClelland model. Today, however, that model is severely outmoded. We show that the Encoder-Decoder network architectures used in modern NLP systems obviate most of Pinker and Prince’s criticisms without requiring any simplification of the past tense mapping problem. We suggest that the empirical performance of modern networks warrants a reexamination of their utility in linguistic and cognitive modeling.

Highlights

  • In their famous 1986 opus, Rumelhart and McClelland (R&M) describe a neural network capable of transducing English verb stems to their past tense forms

  • The neural network approaches we advocate for achieve this goal, but do not clearly fall into either the single-route or dual-route category: the internal computations performed by each network remain opaque, so we cannot at present claim whether two separable computation paths are present

  • We evaluate the performance of the Encoder-Decoder (ED) network architecture in light of the criticisms P&P levied against the original R&M model


Summary

Introduction

In their famous 1986 opus, Rumelhart and McClelland (R&M) describe a neural network capable of transducing English verb stems to their past tense forms. State-of-the-art morphological generation networks used in NLP, built from the modern evolution of the recurrent neural networks (RNNs) explored by Elman (1990) and others, solve the same problem almost perfectly (Cotterell et al., 2016). This level of performance on a cognitively relevant problem suggests that it is time to consider further incorporating network modeling into the study of linguistics and cognitive science. We focus on an empirical assessment of the ability of a modern state-of-the-art neural architecture to learn linguistic patterns, asking the following questions: (i) Does the learner induce the full set of correct generalizations about the data? (ii) Does the learner acquire those generalizations in a way that resembles human acquisition? The results suggest that modern nets absolutely meet the first criterion, and often meet the second. They do this given limited prior knowledge of linguistic structure: the networks we consider do not have phonological features built into them and must instead learn their own representations for input phonemes.
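To make this modeling setup concrete, the following is a minimal, self-contained sketch of an encoder-decoder transducer of the general kind evaluated here. It is not the authors' exact architecture or data: the toy word pairs, hyperparameters, and use of orthographic characters (rather than phoneme strings) are our own illustrative assumptions. The key property it shares with the networks discussed above is that input symbol representations are learned embeddings, not hand-specified phonological features.

```python
# Toy character-level encoder-decoder (seq2seq) for stem -> past tense.
# Illustrative sketch only: word pairs, hyperparameters, and orthographic
# encoding are assumptions, not the paper's exact setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
PAD, SOS, EOS = 0, 1, 2
pairs = [("walk", "walked"), ("jump", "jumped"), ("sing", "sang")]
chars = sorted({c for stem, past in pairs for c in stem + past})
stoi = {c: i + 3 for i, c in enumerate(chars)}  # ids 0-2 are reserved
itos = {i: c for c, i in stoi.items()}
vocab = len(stoi) + 3

def encode(word):
    return [stoi[c] for c in word] + [EOS]

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        # Input representations are learned embeddings: no phonological
        # features are built in, mirroring the networks discussed above.
        self.emb = nn.Embedding(vocab_size, dim, padding_idx=PAD)
        self.enc = nn.LSTM(dim, dim, batch_first=True)
        self.dec = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, src, tgt_in):
        _, state = self.enc(self.emb(src))        # encode the stem
        h, _ = self.dec(self.emb(tgt_in), state)  # teacher-forced decoding
        return self.out(h)

model = Seq2Seq(vocab)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

for epoch in range(300):  # tiny dataset, so the model simply memorizes it
    for stem, past in pairs:
        src = torch.tensor([encode(stem)])
        tgt = torch.tensor([[SOS] + encode(past)])
        logits = model(src, tgt[:, :-1])
        loss = loss_fn(logits.reshape(-1, vocab), tgt[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

def predict(stem, max_len=12):
    # Greedy decoding: feed the model's own prediction back in each step.
    src = torch.tensor([encode(stem)])
    _, state = model.enc(model.emb(src))
    tok, out = torch.tensor([[SOS]]), []
    for _ in range(max_len):
        h, state = model.dec(model.emb(tok), state)
        tok = model.out(h).argmax(-1)
        if tok.item() < 3:  # EOS or another reserved id: stop
            break
        out.append(itos[tok.item()])
    return "".join(out)

print(predict("walk"))  # expected "walked" once training has converged
```

In the full-scale setting, the same recipe is applied to phoneme strings over a large verb lexicon, typically with attention and larger hidden states; the greedy decoder above is only the simplest possible inference scheme.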

The English Past Tense
Acquisition of the Past Tense
Encoder-Decoder Architectures
Related Work
Non-neural Learners
Evaluation of the ED Learner
Experiment 1
Results and Discussion
Experiment 2
Results
Summary of Resolved and Outstanding Criticisms
Conclusion
