Abstract

It is well known that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency between the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication system increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research. Our analysis of the evolution of Zipf's law yields two main unexpected results. First, in children the exponent of the law tends to decrease over time, while this tendency is weaker in adults, suggesting that the decrease is not a mere mirror effect of adult speech. Second, although the exponent of the law is more stable in adults, their exponents fall below 1, the value typically assumed for both children and adults. Our analysis also shows a tendency of the mean length of utterances (MLU), a simple estimate of syntactic complexity, to increase as the exponent decreases. The parallel evolution of the exponent and MLU supports the hypothesis that the exponent of Zipf's law and linguistic complexity are interrelated. The assumption that Zipf's law for word ranks is a power law with a constant exponent of one in both adults and children needs to be revised.

Highlights

  • Word frequencies arrange themselves according to Zipf’s law [1,2]

  • The right-truncated zeta distribution was fitted to transcripts from longitudinal studies of child language from the CHILDES database [23]

  • The dependency of the exponent α on time contradicts the assumption of a constant exponent, and the value of the exponent itself departs from the typically assumed value of 1


Summary

Introduction

Word frequencies arrange themselves according to Zipf’s law [1,2]. G. K. Zipf showed that if the most frequent word in a text is assigned rank 1, the second most frequent word is assigned rank 2, and so on, then $f(r)$, the frequency of a word of rank $r$, obeys [1]

$$f(r) \propto r^{-\alpha},$$

where $\alpha$ is the exponent of the law. Zipf’s law can be formalized using a right-truncated zeta distribution [5]. Suppose that ranks go from 1 to a certain maximum value $r_M$. The rank $R$ is distributed according to a right-truncated zeta distribution if and only if the probability of a word of rank $r$ is [5]

$$p(r) = \frac{r^{-\alpha}}{\sum_{i=1}^{r_M} i^{-\alpha}}.$$
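As a rough illustration of the right-truncated zeta distribution, the Python sketch below computes p(r) = r^(-α) / Σ_{i=1..r_M} i^(-α) and estimates the exponent α from a list of tokens. Several choices here are assumptions rather than the procedure used in the study: the function names are hypothetical, r_M is taken to be the number of distinct word types observed, and α is estimated by maximum likelihood with a simple golden-section search over a fixed interval.

```python
import math
from collections import Counter


def truncated_zeta_pmf(alpha, r_max):
    """Return [p(1), ..., p(r_max)] with p(r) = r**(-alpha) / sum_i i**(-alpha)."""
    weights = [r ** (-alpha) for r in range(1, r_max + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


def rank_frequencies(tokens):
    """Word frequencies in decreasing order, so index 0 holds rank 1."""
    return sorted(Counter(tokens).values(), reverse=True)


def log_likelihood(alpha, freqs):
    """Log-likelihood of alpha, given the frequency of each rank.

    Assumption: r_max is the number of observed ranks, len(freqs).
    """
    r_max = len(freqs)
    log_norm = math.log(sum(r ** (-alpha) for r in range(1, r_max + 1)))
    return sum(
        f * (-alpha * math.log(r) - log_norm)
        for r, f in enumerate(freqs, start=1)
    )


def fit_alpha(freqs, lo=0.0, hi=5.0, tol=1e-6):
    """Maximise the log-likelihood over [lo, hi] by golden-section search.

    The log-likelihood is concave in alpha, so it has a single maximum.
    The search interval [0, 5] is an illustrative assumption.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if log_likelihood(c, freqs) > log_likelihood(d, freqs):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2


if __name__ == "__main__":
    # Toy utterance, for illustration only (not data from the study).
    tokens = "the cat saw the dog and the dog saw the cat".split()
    freqs = rank_frequencies(tokens)
    print("rank frequencies:", freqs)
    print("estimated exponent:", fit_alpha(freqs))
```

A toy sample this small says little about the exponent; the sketch is only meant to make the definition concrete. Real transcripts would need tokenisation and, for a longitudinal analysis, one fit per recording session.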
