Abstract

In his pioneering research, G.K. Zipf observed that more frequent words tend to have more meanings, and showed that the number of meanings of a word grows as the square root of its frequency. He derived this relationship from two assumptions: that words follow Zipf's law for word frequencies (a power law dependency between frequency and rank) and Zipf's law of meaning distribution (a power law dependency between number of meanings and rank). Here we show that a single assumption on the joint probability of a word and a meaning suffices to infer Zipf's meaning‐frequency law or relaxed versions. Interestingly, this assumption can be justified as the outcome of a biased random walk in the process of mental exploration.

Highlights

  • G.K. Zipf (1949) investigated many statistical regularities of language

  • If $p(s_i, r_j)$ is regarded as the weight of the association between $s_i$ and $r_j$, it defines the general form of the relationship between the weight of an edge and the product of the degrees of vertices at both ends that is found in real networks (Barrat, Barthelemy, Pastor-Satorras, & Vespignani, 2004)

  • We have seen that a relaxed version of the law (Equation 7) can be obtained from Equation 15 without making any further assumption


Summary

Introduction

G.K. Zipf (1949) investigated many statistical regularities of language. Some of them have been investigated intensively, such as Zipf’s law for word frequencies (Fedorowicz, 1982; Ferrer-i-Cancho & Gavalda, 2009; Font-Clos, Boleda, & Corral, 2013; Ferrer-i-Cancho, 2016a) or Zipf’s law of abbreviation (Strauss, Grzybek, & Altmann, 2006; Ferrer-i-Cancho et al., 2013). We will present a minimalist derivation of the meaning-frequency law (Equation 1) with $\delta = 1/2$ that is based on just one assumption on the joint probability of a word and a meaning. This assumption is a more elegant solution for three reasons: it corrects the arbitrariness of the assumption of the minimalist derivation, it fits into standard network theory, and it can be embedded into a general theory of communication. From this deeper assumption we derive the meaning-frequency law following three major paths. The third path makes no further assumption and obtains a relaxed version of the meaning-frequency law, namely, that the number of meanings $\mu$ is bounded above and below by two power laws of the frequency $f$, that is, $b_1 f^{\delta} \le \mu \le b_2 f^{\delta}$, where $b_1$ and $b_2$ are constants such that $b_1 \le b_2$. We will discuss the results, highlighting the connection with biased random walks, and indicate directions for future research.
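As a rough illustration of Zipf's original two-assumption argument described in the abstract, suppose frequency and number of meanings both decay as power laws of rank $i$, say $f \propto i^{-\alpha}$ and $\mu \propto i^{-\gamma}$. Eliminating the rank gives $\mu \propto f^{\gamma/\alpha}$, so the meaning-frequency exponent is $\delta = \gamma/\alpha$. The sketch below checks this numerically. The symbols $\alpha$ and $\gamma$, the function names, and the default values (chosen only so that $\delta = 1/2$) are assumptions made for this example and are not taken from the text above.

```python
import math


def fit_loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    var = sum((a - mean_x) ** 2 for a in log_x)
    return cov / var


def meaning_frequency_exponent(alpha=1.0, gamma=0.5, n_words=1000):
    """Estimate delta in mu ~ f**delta from two synthetic rank laws.

    alpha: assumed exponent of the frequency-rank law, f ~ i**(-alpha)
    gamma: assumed exponent of the meaning-rank law, mu ~ i**(-gamma)
    Both are hypothetical parameters for this illustration.
    """
    ranks = range(1, n_words + 1)
    freqs = [i ** (-alpha) for i in ranks]
    meanings = [i ** (-gamma) for i in ranks]
    return fit_loglog_slope(freqs, meanings)


if __name__ == "__main__":
    # With the illustrative defaults the estimate should be close to gamma / alpha.
    print(meaning_frequency_exponent())
```

Because the synthetic data follow exact power laws, the fitted slope matches $\gamma/\alpha$ up to floating-point error; real word counts would be noisy and would call for a more careful estimation procedure than a plain log-log regression.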

A Mathematical Framework
A Theoretical Derivation of the Law
A Family of Optimization Models of Communication
Findings
Discussion