Abstract

We investigate how short- and long-range word length correlations evolve in Chinese narrative texts. For short-range word length correlations, we find no significant linear evolutionary trend. For long-range correlations, two opposite tendencies appear in two different regimes: the Hurst exponent of small-scale word length correlations (box size n from 10 to 100) decreases over time, whereas the exponent of large-scale correlations (box size n from 101 to 1000) tends to increase. The results corroborate the increase in word length as an essential regularity of word evolution in written Chinese. Further analyses show a significant correlation between the small-scale Hurst exponents and mean word length across time. These results indicate that the evolution of word length correlations follows different self-adaptive mechanisms at different scales of distance between words. We speculate that the increase in word length and sentence length in written Chinese may account for this phenomenon, in terms of both socio-cultural factors and the self-adapting properties of language structures.
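To make the two scaling regimes concrete, the sketch below estimates a scaling exponent separately for box sizes 10 to 100 and 101 to 1000. It assumes the exponent is obtained by first-order detrended fluctuation analysis (DFA), which the mention of a box size n suggests but the abstract does not state. The function names, the chosen box sizes within each range, and the synthetic word length series are all illustrative assumptions, not the authors' data or code.

```python
import math
import random


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(series, n):
    """RMS fluctuation F(n) of the linearly detrended profile for box size n."""
    mean = sum(series) / len(series)
    # Profile: cumulative sum of deviations from the mean.
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)
    n_boxes = len(profile) // n  # trailing points that do not fill a box are dropped
    xs = list(range(n))
    squared = 0.0
    for b in range(n_boxes):
        box = profile[b * n:(b + 1) * n]
        slope, intercept = linear_fit(xs, box)
        squared += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, box))
    return math.sqrt(squared / (n_boxes * n))


def scaling_exponent(series, box_sizes):
    """Slope of log F(n) against log n over the given box sizes."""
    log_n = [math.log(n) for n in box_sizes]
    log_f = [math.log(dfa_fluctuation(series, n)) for n in box_sizes]
    slope, _ = linear_fit(log_n, log_f)
    return slope


# Synthetic, uncorrelated word lengths (hypothetical stand-in for a real text).
random.seed(0)
word_lengths = [random.choice([1, 1, 2, 2, 2, 3, 4]) for _ in range(20000)]

small_scale = scaling_exponent(word_lengths, [10, 16, 25, 40, 63, 100])
large_scale = scaling_exponent(word_lengths, [101, 160, 250, 400, 630, 1000])
print(f"small-scale exponent: {small_scale:.2f}")
print(f"large-scale exponent: {large_scale:.2f}")
```

For an uncorrelated series such as this synthetic one, both exponents should lie near 0.5. Values that differ between the two ranges, as the abstract reports for real texts, would point to distinct scaling regimes.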

Highlights

  • As a result of human evolution [1, 2], language is closely related to the physical evolution of humans and to the increasing need for effective communication [3]

  • Results are reported for both short-range and long-range word length correlations

  • The word length distributions of the six periods differ mainly in the tail, that is, in the distribution of long words

Summary

Introduction

As a result of human evolution [1, 2], language is closely related to the physical evolution of humans and to the increasing need for effective communication [3]. Many studies show that language can be usefully described as a complex system [4,5,6,7] with a hierarchical structure in terms of syntactic organization [8, 9], from morphemes to words, phrases, and sentences. The word is the fundamental unit of language: words are arranged and structured according to syntactic principles to form phrases, sentences, and texts [8]. Lexical features may reflect linguistic properties at the level of words and shed light on syntactic patterns at the level of phrases and sentences. Kohler [15] points out that word length may reflect the properties of the basic units of language, namely words. Several methods are available for investigating word length in sequences, including word length entropies [19, 20] (Papadimitriou, 2010; Grotjahn, 1979), word length correlations [21, 22], word length repetitions [23], and, most recently, word length motifs [15, 18, 24].
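As a small illustration of the first method listed above, the sketch below computes the Shannon entropy of a word length distribution. It is a generic textbook formulation given here as an assumption; the cited studies may define word length entropy differently, and the function name and sample sequences are hypothetical.

```python
import math
from collections import Counter


def word_length_entropy(lengths):
    """Shannon entropy, in bits, of the empirical distribution of word lengths."""
    counts = Counter(lengths)
    total = len(lengths)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical word length sequences, measured in characters per word.
uniform_text = [1, 2, 1, 2]  # two lengths, equally frequent
constant_text = [2, 2, 2]    # a single length only

print(word_length_entropy(uniform_text))
print(word_length_entropy(constant_text))
```

Two equally frequent lengths give one bit of entropy, while a text with a single word length gives zero. The measure ignores word order, which is why it is complemented by the correlation-based methods in the same list.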
