Abstract

In linguistic studies, the academic level of the vocabulary in a text can be described in terms of statistical physics by using a “temperature” concept related to the text's word-frequency distribution. We propose a “comparative thermo-linguistic” technique that analyzes the vocabulary of a text to determine its academic level and its target readership in any given language. We apply this technique to a large number of books by several authors and examine how the vocabulary of a text changes when it is translated from one language to another. Whereas the Zipf law yields uniform results, our “word energy” distribution technique reveals variations in the power-law behavior. We also examine some common features that span languages and identify some intriguing questions about how to determine whether a text is suitable for its intended readership.

Highlights

  • Scaling laws have been an important topic in the physics community across a wide range of fields [1,2,3]

  • Zipf [17] described another typical example of a power law in data on human behavior

  • A study of textbooks in the English language found that the higher the vocabulary grade level of a textbook, the lower its temperature. For example, the temperature of English textbooks for grades K1 through K12 in the US educational system decreases from 1.48 K to 0.87 K when the 1.00 K temperature of the American National Corpus (ANC) is used as a standard


Summary

Introduction

Scaling laws have been an important topic in the physics community across a wide range of fields [1,2,3]. Zipf [17] described another typical example of a power law in data on human behavior. He proposed that the effort of both speakers and listeners to optimize their communication produces a distinctive distribution, the well-known Zipf law. Recent research has analyzed how the Zipf scaling of the word-frequency distribution has changed over the centuries [15], and how this change is affected by both social and natural phenomena [16]. As is the case for many other scaling laws, the Zipf law can be used in the statistical analysis of huge data sets from other systems [12,18,19,20], e.g., the distribution of wealth and income in a given population [21] or the distribution of family names [22].
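
As a rough illustration of the Zipf law mentioned above, the sketch below ranks the words of a text by frequency and estimates the exponent α in f(r) ∝ r^(-α) with an ordinary least-squares fit on log-log axes. It is a minimal sketch, not the authors' procedure: the function names `rank_frequency` and `zipf_exponent` are hypothetical, the tokenizer is a simplistic assumption, and it does not compute the “temperature” or “word energy” quantities, whose definitions are not given in this excerpt.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent.

    Assumption: words are runs of ASCII letters and apostrophes,
    which is a crude tokenizer chosen only for illustration.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_exponent(counts):
    """Estimate alpha in f(r) ~ r**(-alpha) from ranked counts.

    Fits a straight line to log(frequency) versus log(rank) by
    least squares and returns the slope with its sign flipped.
    Needs at least two ranks.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic check: counts constructed to follow f(r) = 1000 / r,
# so the fitted exponent should be very close to 1.
ideal_counts = [1000 / rank for rank in range(1, 201)]
alpha = zipf_exponent(ideal_counts)

# Tiny real-text example of the ranking step.
sample_counts = rank_frequency("the cat and the dog and the bird")
```

On real texts a plain least-squares fit over all ranks is sensitive to the low-frequency tail, so the value it returns should be read as a rough descriptive estimate rather than a careful measurement.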

