Abstract

Zipf's law states that if the words of a language are sorted in order of decreasing frequency of usage, a word's frequency is inversely proportional to its rank, that is, its sequence number in the list. The Zipf-Mandelbrot law is a more general formula that provides a better fit in the low-rank region. Among the several models that aim to explain this effect, Mandelbrot's model is one of the best known. It derives Zipf's law by optimizing the information/cost ratio, but leads to an unrealistic view of texts as random character sequences. In this article, a new modification of the model is proposed that is free from this drawback and allows the optimal information/cost ratio to be achieved via language evolution. It is demonstrated that the Zipf-Mandelbrot formula follows from this model, but that its two parameters are not independent. As a result, the formula cannot be convincingly fitted to actual word frequency distributions.
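
For reference, the two laws named above are commonly written in the form below. The notation is an assumption and is not taken from the abstract: f(r) denotes the frequency of the word of rank r, C is a normalizing constant, and α and β are the two Zipf-Mandelbrot parameters.

```latex
% Zipf's law: frequency is inversely proportional to rank
f(r) = \frac{C}{r}

% Zipf-Mandelbrot law: two extra parameters, \alpha and \beta (assumed notation)
f(r) = \frac{C}{(r + \beta)^{\alpha}}
```

Setting α = 1 and β = 0 in the second formula recovers Zipf's law. For large r the shift β becomes negligible, so the two forms differ mainly in the low-rank region.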

