Abstract

Cross-linguistic studies focus on inverse correlations (trade-offs) between linguistic variables that reflect different cues to linguistic meanings. For example, if a language has no case marking, it is likely to rely on word order as a cue for identification of grammatical roles. Such inverse correlations are interpreted as manifestations of language users’ tendency to use language efficiently. The present study argues that this interpretation is problematic. Linguistic variables, such as the presence of case, or flexibility of word order, are aggregate properties, which do not represent the use of linguistic cues in context directly. Still, such variables can be useful for circumscribing the potential role of communicative efficiency in language evolution, if we move from cross-linguistic trade-offs to multivariate causal networks. This idea is illustrated by a case study of linguistic variables related to four types of Subject and Object cues: case marking, rigid word order of Subject and Object, tight semantics and verb-medial order. The variables are obtained from online language corpora in thirty languages, annotated with the Universal Dependencies. The causal model suggests that the relationships between the variables can be explained predominantly by sociolinguistic factors, leaving little space for a potential impact of efficient linguistic behavior.

Highlights

  • The strongest negative correlation is between case marking and rigid order of Subject and Object

  • This case study investigated the relationships between different cues that help the addressee to assign the grammatical roles of Subject and Object in a transitive clause

  • The measures that reflect the prominence of these cues were obtained from corpora in thirty languages


Summary

A CORRELATIONAL ANALYSIS OF CROSS-LINGUISTIC DATA

Computing correlations between the variables in this case study is not straightforward because the dataset contains dependent observations (languages from the same genus). In 1,000 simulations, I sampled only one language from each genus and computed Spearman's rank correlation coefficients for each sample. These coefficients were averaged for each pair of variables. For null hypothesis significance testing, I computed and logged the test statistic for the original pairs of scores in every simulation. I then ran 1,000 permutations, in which the original scores of the second variable were randomly reshuffled. The permutation scores represented the distribution of the test statistic under the null hypothesis. The p-values were averaged across the 1,000 samplings from the genera.
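
The resampling procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's own code: the function names (`average_ranks`, `spearman`, `genus_sampled_test`), the `(genus, x, y)` input layout, the two-sided comparison, and the definition of the p-value as the share of permutations at least as extreme as the observed statistic are all hypothetical choices that the text does not specify.

```python
import random
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def genus_sampled_test(rows, n_samples=1000, n_perm=1000, seed=0):
    """rows: iterable of (genus, x, y). Returns (mean rho, mean p-value)."""
    rng = random.Random(seed)
    by_genus = defaultdict(list)
    for genus, x, y in rows:
        by_genus[genus].append((x, y))

    rhos, pvals = [], []
    for _ in range(n_samples):
        # One randomly chosen language per genus removes within-genus dependence.
        sample = [rng.choice(langs) for langs in by_genus.values()]
        xs = [pair[0] for pair in sample]
        ys = [pair[1] for pair in sample]
        observed = spearman(xs, ys)

        # Null distribution: reshuffle the second variable only.
        shuffled = ys[:]
        extreme = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)
            # Assumption: two-sided test on the absolute value of rho.
            if abs(spearman(xs, shuffled)) >= abs(observed):
                extreme += 1

        rhos.append(observed)
        pvals.append(extreme / n_perm)

    # Average the coefficients and the p-values across the genus samplings.
    return sum(rhos) / len(rhos), sum(pvals) / len(pvals)
```

Calling `genus_sampled_test(rows)` with one tuple per language would return the averaged coefficient and averaged p-value for a single pair of variables; repeating the call for each pair of variables would give the full set of correlations.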

Results of Correlational Analyses
Motivation for Causal Analysis
A Causal Network
A Possible Diachronic Scenario
CONCLUSION
DATA AVAILABILITY STATEMENT