Cross-Linguistic Trade-Offs and Causal Relationships Between Cues to Grammatical Subject and Object, and the Problem of Efficiency-Related Explanations.

Natalia Levshina

doi:10.3389/fpsyg.2021.648200

Open Access

Abstract

Cross-linguistic studies focus on inverse correlations (trade-offs) between linguistic variables that reflect different cues to linguistic meanings. For example, if a language has no case marking, it is likely to rely on word order as a cue for identification of grammatical roles. Such inverse correlations are interpreted as manifestations of language users’ tendency to use language efficiently. The present study argues that this interpretation is problematic. Linguistic variables, such as the presence of case, or flexibility of word order, are aggregate properties, which do not represent the use of linguistic cues in context directly. Still, such variables can be useful for circumscribing the potential role of communicative efficiency in language evolution, if we move from cross-linguistic trade-offs to multivariate causal networks. This idea is illustrated by a case study of linguistic variables related to four types of Subject and Object cues: case marking, rigid word order of Subject and Object, tight semantics and verb-medial order. The variables are obtained from online language corpora in thirty languages, annotated with the Universal Dependencies. The causal model suggests that the relationships between the variables can be explained predominantly by sociolinguistic factors, leaving little space for a potential impact of efficient linguistic behavior.

Highlights

The strongest negative correlation is between case marking and rigid order of Subject and Object.
This case study investigated the relationships between different cues that help the addressee to assign the grammatical roles of Subject and Object in a transitive clause.
The measures that reflect the prominence of these cues were obtained from corpora in thirty languages.

Summary

A CORRELATIONAL ANALYSIS OF CROSS-LINGUISTIC DATA

Computing correlations between the variables in this case study is not straightforward because the dataset contains dependent observations, namely languages from the same genus. To address this, I ran 1,000 simulations, in each of which I sampled only one language from each genus and computed Spearman’s rank correlation coefficients for that sample. These coefficients were then averaged for each pair of variables. For null hypothesis significance testing, I computed and logged the test statistic for the original pairs of scores in every simulation. I then ran 1,000 permutations, in which the original scores of the second variable were randomly reshuffled. The resulting permutation scores represented the distribution of the test statistic under the null hypothesis. The p-values were averaged across the 1,000 samplings from the genera.
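The resampling procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names (`ranks`, `spearman`, `genus_sampled_correlation`), the `(genus, x, y)` input layout, and the two-sided definition of the p-value (the share of permuted coefficients at least as large in absolute value as the observed one) are all hypothetical choices made for this sketch.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant variable counts as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def genus_sampled_correlation(data, n_samples=1000, n_perm=1000, seed=0):
    """data: list of (genus, x, y) tuples, one tuple per language.

    Returns (mean rho, mean p-value) across the genus-level samplings.
    """
    rng = random.Random(seed)
    by_genus = {}
    for genus, x, y in data:
        by_genus.setdefault(genus, []).append((x, y))

    rhos, p_values = [], []
    for _ in range(n_samples):
        # Keep one randomly chosen language per genus.
        sample = [rng.choice(langs) for langs in by_genus.values()]
        xs = [pair[0] for pair in sample]
        ys = [pair[1] for pair in sample]
        observed = spearman(xs, ys)

        # Null distribution: reshuffle the scores of the second variable.
        extreme = 0
        for _ in range(n_perm):
            shuffled = ys[:]
            rng.shuffle(shuffled)
            # Assumed two-sided test; small tolerance for float noise.
            if abs(spearman(xs, shuffled)) >= abs(observed) - 1e-12:
                extreme += 1

        rhos.append(observed)
        p_values.append(extreme / n_perm)

    return mean(rhos), mean(p_values)
```

The function would be called once per pair of variables. With the default settings it performs a million rank correlations per pair, so a pure-Python version like this one is slow; smaller `n_samples` and `n_perm` values are convenient for trying it out.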

Results of Correlational Analyses

Motivation for Causal Analysis

A Causal Network

A Possible Diachronic Scenario

CONCLUSION

DATA AVAILABILITY STATEMENT