Abstract

The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of both the effectiveness and a limitation of neural networks for language engineering. Specifically, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf’s law and Heaps’ law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf’s law and Heaps’ law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
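For reference, the two laws mentioned above can be checked empirically on any text. The following is a minimal sketch, not taken from the paper: it assumes a whitespace-tokenized text and an illustrative file name, and computes the rank-frequency distribution (Zipf’s law) and the vocabulary growth curve (Heaps’ law).

```python
# Minimal sketch (assumptions: whitespace tokenization, illustrative file name).
from collections import Counter

def zipf_rank_frequency(tokens):
    """Return (rank, frequency) pairs; Zipf's law predicts frequency ~ rank^-1."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

def heaps_vocabulary_growth(tokens, step=1000):
    """Return (text length, vocabulary size) pairs; Heaps' law predicts
    vocabulary size ~ length^beta with beta < 1."""
    seen, growth = set(), []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i % step == 0:
            growth.append((i, len(seen)))
    return growth

if __name__ == "__main__":
    with open("corpus.txt", encoding="utf-8") as f:  # hypothetical corpus file
        tokens = f.read().split()
    # On log-log axes, both curves should be close to straight lines
    # if the statistical laws hold for this text.
    print(zipf_rank_frequency(tokens)[:10])
    print(heaps_vocabulary_growth(tokens)[:10])
```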

Highlights

  • Deep learning has performed spectacularly in various natural language processing tasks such as machine translation [1], text summarization [2], dialogue systems [3], and question answering [4]

  • We have found that two well-acknowledged statistical laws of natural language (Zipf’s law [12] and Heaps’ law [13], [14], [15]) almost hold for the pseudo-text generated by a neural language model

  • The stacked long short-term memory (LSTM) can reproduce the power-law behavior of the rank-frequency distribution of long n-grams (see the sketch after this list). These results indicate that a neural language model can learn the statistical laws behind natural language, and that the stacked LSTM is especially capable of reproducing both patterns of n-grams and the properties of vocabulary growth
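The last highlight concerns the rank-frequency distribution of n-grams in the generated pseudo-text. As a minimal sketch (not the paper’s own code; the file name and the values of n are illustrative assumptions), such a distribution can be computed as follows:

```python
# Minimal sketch: rank-frequency distribution of n-grams in a pseudo-text,
# assuming whitespace tokenization and an illustrative file name.
from collections import Counter

def ngram_rank_frequency(tokens, n):
    """Count n-grams and return their frequencies sorted by rank.
    A power-law rank-frequency curve appears as a straight line on log-log axes."""
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(ngrams)
    return sorted(counts.values(), reverse=True)

if __name__ == "__main__":
    with open("pseudo_text.txt", encoding="utf-8") as f:  # hypothetical model output
        tokens = f.read().split()
    for n in (1, 3, 5):  # short and long n-grams
        freqs = ngram_rank_frequency(tokens, n)
        print(n, freqs[:10])
```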

Summary

Introduction

Deep learning has performed spectacularly in various natural language processing tasks such as machine translation [1], text summarization [2], dialogue systems [3], and question answering [4]. However, the reasons for this success remain unclear because of the inherent complexity of deep learning. We have found that two well-acknowledged statistical laws of natural language (Zipf’s law [12] and Heaps’ law [13], [14], [15]) almost hold for the pseudo-text generated by a neural language model. This finding is notable because previous language models, such as Markov models, cannot reproduce such properties, and mathematical models designed to reproduce statistical laws [16], [17] are limited in their purpose. The analyses described in this paper contribute to our understanding of the performance of neural networks and provide guidance as to how we can improve models.
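For concreteness, the following is a minimal sketch of a stacked LSTM language model in PyTorch that can sample a pseudo-text for this kind of analysis. The vocabulary size, embedding and hidden dimensions, number of layers, and the sampling procedure are illustrative assumptions, not the architecture or hyperparameters reported in the paper.

```python
# Minimal sketch of a stacked LSTM language model (illustrative hyperparameters).
import torch
import torch.nn as nn

class StackedLSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) integer word ids
        emb = self.embed(tokens)
        hidden, state = self.lstm(emb, state)
        return self.out(hidden), state  # logits over the next word

    @torch.no_grad()
    def generate(self, start_id, length=1000):
        """Sample `length` tokens of pseudo-text following `start_id`."""
        ids, state = [start_id], None
        tok = torch.tensor([[start_id]])
        for _ in range(length):
            logits, state = self(tok, state)
            probs = torch.softmax(logits[0, -1], dim=-1)
            tok = torch.multinomial(probs, 1).view(1, 1)
            ids.append(tok.item())
        return ids
```

Pseudo-text sampled from a trained model of this kind is the input to which the rank-frequency and vocabulary-growth analyses above would be applied.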

Neural language model
The Emergence of Zipf’s law and Heaps’ law
Neural language models are limited in reproducing long-range correlation
Findings
Conclusion