Proper Interpretation of Heaps' and Zipf's Laws

Kim Chol-Jun

doi:10.48550/arxiv.2305.15413

Journal: arXiv (Cornell University)

Publication Date: May 5, 2023

#Distribution Of Words In Text #Zipf's Law

Abstract

We checked that the distribution of words in a text should be uniform, which gives Heaps' law as a natural result: the number of word types can be expressed as a power law of the number of tokens in the text. We developed a "superposition" model, which leads to an asymptotic power-law distribution of the number of occurrences (or frequency) of words, that is, Zipf's law. The model agrees well with observations.
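
To make the two quantities in the abstract concrete, the following is a minimal sketch of how the type-token growth curve (Heaps' law) and the rank-frequency list (Zipf's law) could be measured on a tokenized text. It is not the paper's superposition model. The function names `heaps_curve`, `fit_power_law`, and `rank_frequency` are hypothetical, and the log-log least-squares fit is an assumed, commonly used way to estimate a power-law exponent.

```python
import math
from collections import Counter


def heaps_curve(tokens):
    """Return [(n, V)], where V is the number of distinct types in the first n tokens."""
    seen = set()
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve


def fit_power_law(points):
    """Fit y = k * x**beta by least squares on (log x, log y); return (k, beta).

    Assumption: all x and y are positive and at least two distinct x values exist.
    """
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


def rank_frequency(tokens):
    """Return [(rank, count)] sorted by decreasing count, as used in a Zipf plot."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))
```

Applying `fit_power_law` to the output of `heaps_curve` gives an estimate of the Heaps exponent, and applying it to the output of `rank_frequency` gives an estimate of the Zipf exponent (negative for a decaying rank-frequency curve). On short texts both estimates are noisy, so they are best read as rough indicators.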

Full Text