Abstract

Large-scale empirical evidence indicates a fascinating statistical relationship between the estimated number of users of a language and its linguistic and statistical structure. In this context, the linguistic niche hypothesis argues that this relationship reflects a negative selection against morphological paradigms that are hard to learn for adults, because languages with a large number of speakers are assumed to be typically spoken and learned by greater proportions of adults. In this paper, this conjecture is tested empirically for more than 2000 languages. The results question the idea of the impact of non-native speakers on the grammatical and statistical structure of languages, as it is demonstrated that the relative proportion of non-native speakers does not significantly correlate with either morphological or information-theoretic complexity. While it thus seems that large numbers of adult learners/speakers do not affect the (grammatical or statistical) structure of a language, the results suggest that there is indeed a relationship between the number of speakers and (especially) information-theoretic complexity, i.e. entropy rates. A potential explanation for the observed relationship is discussed.

Highlights

  • In an influential and widely cited paper, Lupyan & Dale [1] present striking large-scale evidence for a statistical relationship between the estimated number of language users and structural properties of languages, especially several factors related to morphological complexity, e.g. the use of inflections to mark grammatical relationships in a sentence

  • In accordance with the linguistic niche hypothesis, this indicates that with ‘increased geographical spread and an increasing speaker population, a language is more likely to be subjected to learnability biases and limitations of adult learners’ [1, p. 7]

  • The results presented in this paper question the idea of an impact of non-native speakers on the grammatical and statistical structure of languages


Summary

Introduction

In an influential and widely cited paper, Lupyan & Dale [1] present striking large-scale evidence for a statistical relationship between the estimated number of language users and structural properties of languages, especially several factors related to morphological complexity, e.g. the use of inflections to mark grammatical relationships in a sentence. Lupyan & Dale [1] demonstrate that, for translations of a standard text (the Universal Declaration of Human Rights) into more than 100 different languages, languages with more speakers tend to be less informationally redundant. From an information-theoretic point of view, data compression, i.e. the entropy rate of a source, can be interpreted as a measure of complexity [4]: the smaller the degree of redundancy in a string, the harder it is to predict subsequent text based on previous input [5], and the greater its complexity [6]. The results of [1] indicate that languages with more speakers are morphologically less complex, but informationally more complex. This points towards a negative statistical association between morphological complexity and entropy rates.
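The link between redundancy and compression described above can be illustrated with a minimal sketch. It is not the estimator used in the paper, which this excerpt does not specify; it assumes that the size of a zlib-compressed string, expressed in bits per character, serves as a rough stand-in for the entropy rate. The function name `compressed_bits_per_char` and both toy strings are hypothetical and chosen only for illustration.

```python
import random
import zlib


def compressed_bits_per_char(text: str) -> float:
    """Rough entropy-rate proxy: compressed size in bits per character.

    Assumption: zlib (DEFLATE) stands in for whatever estimator a real
    study would use; it only gives an upper-bound style approximation.
    """
    if not text:
        raise ValueError("text must be non-empty")
    compressed = zlib.compress(text.encode("utf-8"), 9)
    return 8 * len(compressed) / len(text)


# Toy inputs (illustrative only, not data from the paper).
# A highly redundant string: the same phrase repeated many times.
redundant = "the cat sat on the mat " * 200

# A much less redundant string of the same length: characters drawn
# uniformly at random from a small alphabet, with a fixed seed.
rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
less_redundant = "".join(rng.choice(alphabet) for _ in range(len(redundant)))

print("redundant:     ", compressed_bits_per_char(redundant))
print("less redundant:", compressed_bits_per_char(less_redundant))
```

Under these assumptions, the repeated phrase compresses to far fewer bits per character than the random string, which mirrors the claim that lower redundancy makes subsequent text harder to predict. An off-the-shelf compressor is used here only because it is widely available; absolute values from it should not be read as entropy rates of real languages.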

