The paradox of SOV: A case for token-based typology
This study addresses a paradox in word order typology. On the one hand, the SOV order has longer dependency distances and therefore higher processing costs compared to verb-medial order. On the other hand, it is the most frequent word order in languages of the world. How come? A study of corpus data annotated with Universal Dependencies provides a simple answer: the costly long distances occur more rarely than one would assume because SOV clauses are infrequent in language use. A quantitative analysis of 150 Universal Dependencies corpora shows that the proportions of verb-final clauses with two overt core arguments are low across languages, including predominantly verb-final languages. Moreover, a series of Bayesian phylogenetic models based on comparable corpora in thirty-two languages shows a negative correlation between the proportion of verb-final clauses in a language and the average number of arguments in a clause, while controlling for argument indexing and high- and low-context culture. A closer examination of argument configurations reveals a positive correlation between proportions of verb-final clauses and proportions of subjectless clauses; as for proportions of objectless clauses, the evidence is less clear. The study highlights the importance of the token-based, gradient approach to typology, which gives us insights into what kind of structures language users prefer, and what they avoid.
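The token-based counting described in the abstract can be illustrated with a minimal sketch. It assumes input in the CoNLL-U format used by Universal Dependencies and treats a clause as "verb-final with two overt core arguments" when a VERB-tagged head has both an `nsubj` and an `obj` dependent and both precede it. The sample sentences, the function names `parse_conllu` and `classify_clauses`, and this operationalization are assumptions made for illustration; they are not the study's actual procedure.

```python
from collections import Counter

# Tiny hand-made CoNLL-U sample (hypothetical sentences, not taken from a real corpus).
SAMPLE = """\
1\tTaro\tTaro\tPROPN\t_\t_\t3\tnsubj\t_\t_
2\tringo\tringo\tNOUN\t_\t_\t3\tobj\t_\t_
3\ttabeta\ttaberu\tVERB\t_\t_\t0\troot\t_\t_

1\tTabeta\ttaberu\tVERB\t_\t_\t0\troot\t_\t_

1\tMary\tMary\tPROPN\t_\t_\t2\tnsubj\t_\t_
2\tsees\tsee\tVERB\t_\t_\t0\troot\t_\t_
3\tJohn\tJohn\tPROPN\t_\t_\t2\tobj\t_\t_
"""


def parse_conllu(text):
    """Yield each sentence as a list of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if not cols[0].isdigit():
            continue
        sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


def classify_clauses(sentences):
    """Count verb-headed clauses and those with overt subject and object."""
    counts = Counter()
    for sent in sentences:
        for tok_id, upos, _head, _deprel in sent:
            if upos != "VERB":
                continue
            # Dependents of this verb; relation subtypes (e.g. "nsubj:pass")
            # are collapsed to their base relation in this simplification.
            deps = [(i, rel.split(":")[0]) for i, _, h, rel in sent if h == tok_id]
            subj = [i for i, rel in deps if rel == "nsubj"]
            obj = [i for i, rel in deps if rel == "obj"]
            counts["clauses"] += 1
            if subj and obj:
                counts["two_overt_args"] += 1
                # Verb-final here means both arguments precede the verb.
                if max(subj + obj) < tok_id:
                    counts["verb_final_two_args"] += 1
    return counts


counts = classify_clauses(parse_conllu(SAMPLE))
share = counts["verb_final_two_args"] / counts["clauses"]
print(dict(counts), f"share of verb-final two-argument clauses: {share:.2f}")
```

On a real treebank, the same counts would be gathered per language and the share compared across corpora; copular clauses, clausal arguments, and auxiliaries would each need a deliberate decision that this sketch leaves out.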
- Preprint Article
- 10.31234/osf.io/wfbpv
- Jan 10, 2025
This study addresses a paradox in word order typology. On the one hand, the SOV order has longer dependency distances and therefore higher processing costs compared to verb-medial order. On the other hand, it is the most frequent word order in languages of the world. How come? An analysis of large-scale corpus data in thirty-two languages annotated with Universal Dependencies provides a simple answer: the costly long distances occur more rarely than one would assume because verb-final languages usually have fewer arguments compared to verb-medial languages. A series of Bayesian phylogenetic models shows a negative correlation between the proportion of verb-final clauses in a language and the average number of arguments in a clause, while controlling for argument indexing and high- and low-context culture. A closer examination of argument configurations reveals a positive correlation between proportions of verb-final clauses and proportions of subjectless clauses; as for proportions of objectless clauses, the evidence is less clear. In addition, a quantitative analysis of 150 Universal Dependencies corpora shows that the proportions of verb-final clauses with two overt arguments are low, even in verb-final languages. The study highlights the importance of the token-based, gradient approach to typology, which gives us insights into what kind of structures language users prefer, and what they avoid.
- Dissertation
41
- 10.17077/etd.11qhthvg
- Jan 21, 2009
The dissertation explores word order phenomena in a ‘free’ word order language, Russian. It has been proposed in the literature that in simple sentences like ‘John sees Mary’, six word orders are equally possible in Russian. The dissertation questions the equal acceptability of these word orders and shows that some of the ‘felicitous’ word orders have a degraded status compared to others. The word order findings are based on experimental evidence from elicitation, perception and grammaticality judgment psycholinguistic studies with 237 adult native speakers of Russian. The results of the experiments demonstrate that Russian speakers have a strong preference for producing some word orders over others. For example, Russian native speakers produce transitive SVO, OVS and SOV felicitous word orders, but consistently do not produce VSO, VOS and OSV felicitous word orders, which they still recognize as acceptable, but as having a degraded grammaticality status. On the basis of the experimental evidence and analysis of the various constituent movements within the Minimalist Program approach, a model of grammar is proposed which adds a pragmatic component responsible for word order permutations. According to this model, the syntactic component of grammar generates only SVO sentences (the basic word order) in Russian. All discourse-dependent sentences result from realignment in the post-syntactic pragmatic component. In contrast to the hierarchical structure of syntax, the pragmatic component of grammar has a linear structure and operates with Optimality Theory-type constraints determining the optimal output word order in a particular discourse structure. The underlying assumption of this model is that this pragmatic component is present in all languages. However, the language-specific ranking of the constraints in this component results in word order variations.
In contrast to the previous structural approaches to word order permutations in Russian, the proposed model has obvious advantages. The model accounts not only for grammaticality and
- Book Chapter
5
- 10.1093/oso/9780198163251.003.0013
- May 1, 1997
In this chapter, I investigate the interaction between centering theory and word order in a ‘free’ word order language, i.e., Turkish. In ‘free’ word order languages, e.g., Czech, Finnish, German, Hindi, Hungarian, Japanese, Korean, Polish, Russian, Turkish, Urdu, the word order serves to structure the information being conveyed to the hearer by indicating what is the topic and the focus of the sentence. These discourse-related notions, which will be defined more thoroughly in the next sections, form the information structure of the sentence. I will argue that centering and information structure have different roles in discourse processing. The information structure of a sentence instructs the hearer on how to update his/her discourse model with the information in the sentence, while centering serves to link the sentence to the prior context.
- Single Book
98
- 10.1075/cilt.25
- Jan 1, 1983
This monograph, discussing various aspects involved in a typology of word order, strives to take a next step towards a better understanding of the profound unity underlying languages. The volume is divided into five sections: 1) Word order typology; 2) A critical analysis of word order typology; 3) Word order within comparative constructions; 4) Word order in the comparative construction in the Rigveda; 5) Diachronic aspects of word order within comparative constructions.
- Research Article
13
- 10.1023/a:1024198104514
- Aug 1, 2003
- Natural Language & Linguistic Theory
In many 'free' word order languages, it is not uncommon to find fixed word order phenomena in which a certain canonical word order becomes fixed under special circumstances. This phenomenon is termed word order freezing. The central dynamic in word order freezing is hierarchy alignment. In Hindi and Korean, for instance, word order is free when the unmarked association among grammatical functions, semantic roles, case and positions in phrase structure matches the relative prominence relations of these dimensions. However, free word order becomes fixed when there is more than one marked association of elements in these different dimensions of prominence in a single clause. This preference for avoiding highly marked associations of prominence hierarchies in word order is a case of the more general phenomenon 'markedness reduction' in typologically marked grammatical contexts. This paper develops an approach to word order variation in two scrambling languages, Hindi and Korean, within Optimality Theory that is capable of subsuming both the free ordering and fixed ordering of constituents under the universal theory of markedness. It accounts in a uniform manner for the universal basis of word order freezing, while at the same time allowing for the range of crosslinguistic and language-internal variation that is observed.
- Research Article
28
- 10.1075/bjl.4.05dry
- Jan 1, 1989
- Belgian Journal of Linguistics
Discourse-Governed Word Order and Word Order Typology
- Book Chapter
6
- 10.1093/oso/9780198236870.003.0013
- Dec 18, 1997
This chapter investigates the interaction between centering and word order in a ‘free’ word order language, Turkish. Word order in Turkish is used to express the information structure of a sentence, i.e., pragmatic notions such as topic, focus, and backgrounding. In a corpus study, I show that the Cb is often placed in the sentence-initial topic position in Turkish regardless of whether this topic is the subject or a scrambled object. However, I argue that centering and information structure play different roles in discourse processing.
- Research Article
23
- 10.1111/j.1467-1770.1986.tb00377.x
- Jun 1, 1986
- Language Learning
Lexical government refers to the relationship between a phrasal head and its complement. In this paper it is used to define a centre and periphery in word order typology. The direction of the government relation gives rise to two word order types. It is proposed that grammars in which the phrasal heads show a major split in their direction of government are more marked than those with a uniform direction. This framework serves to generate multiple, graded predictions about word order in non‐primary acquisition and the predictions are tested on a broad range of available L2 word order data. The investigation indicates that while L2 learners do have access to the defining principle, they may not be as successful as L1 learners in acquiring peripheral word order attributes and word orders with a split in the direction of government.
- Research Article
10
- 10.1016/0388-0001(95)00010-s
- Apr 1, 1995
- Language Sciences
Communicative dynamism and word order in Mandarin Chinese
- Conference Article
8
- 10.3115/1117840.1117850
- Jan 1, 2001
We propose a multilingual approach to characterizing word order at the clause level as a means to realize information structure. We illustrate the problem with three languages which differ in the degree of word order freedom they exhibit: Czech, a free word order language in which word order variation is pragmatically determined; English, a fixed word order language in which word order is primarily grammatically determined; and German, a language which is between Czech and English on the scale of word order freedom. Our work is theoretically rooted in previous work on information structuring and word order in the Prague School framework as well as on the systemic-functional notion of Theme. The approach we present has been implemented in KPML.
- Research Article
- 10.17507/tpls.1509.10
- Sep 3, 2025
- Theory and Practice in Language Studies
Languages differ in how they convey prominence and information structure (IS). In Modern Standard Arabic (MSA), a flexible word order language, new information focus is marked by accent, and contrastive focus by word order displacement (Moutaouakil, 1989). Traditionally, rigid word order languages rely on prosody, while flexible word order languages employ syntactic movement (Donati & Nespor, 2003; Cole, 2015). However, the combined use of word order and prosody to mark prominence in flexible word order languages has not been well studied. This study investigates the interaction between syntactic and prosodic strategies in two Saudi Arabic varieties (Hijazi and Najdi) to determine whether prosody complements or merely replicates the function of word order. A production task with 12 Saudi speakers elicited responses that varied in word order, focus type, and prosodic marking. Acoustic analysis revealed that focused elements exhibited longer vowel durations and wider F0 ranges than non-focused elements, while maximum intensity also varied, though it was influenced by domain-initial strengthening. These results suggest that Saudi speakers use both word order and prosody to mark focus, indicating that prosodic cues are complementary rather than redundant. In addition, the findings contribute to the broader theoretical debate on the syntax-prosody interface and imply the need for a revised typology of focus-marking strategies that integrates both prosodic and syntactic methods.
- Research Article
- 10.1353/lan.2013.0055
- Sep 1, 2013
- Language
Word order by Jae Jung Song. Reviewed by Thomas Wasow. Word order. By Jae Jung Song. (Research surveys in linguistics.) Cambridge: Cambridge University Press, 2012. Pp. xvi, 348. ISBN 9780521693127. $54.50. This book presents a very useful overview of a considerable body of literature. To survey a topic as big as word order in a book of manageable size, it was necessary to make choices about what to omit. Jae Jung Song’s choices will not make all readers happy, but they are not unreasonable ones. First, the literature covered is largely confined to work of the last thirty years. Second, the only theories of grammar considered are the minimalist program (MP) and optimality theory (OT). As a consequence, some extremely interesting ideas about how to handle word-order variation never get mentioned. In particular, there is no mention of the idea of decoupling linear precedence and immediate dominance in a phrase structure grammar, an idea that led to some productive research in generalized phrase structure grammar (Gazdar & Pullum 1981), lexical-functional grammar (Falk 1983), and head-driven phrase structure grammar (Reape 1993). The book has seven chapters: brief introductory and concluding chapters sandwich one surveying what is known about word-order typology (which S abbreviates LT, for linguistic typology), two on MP, one on OT, and one on what S calls ‘the performance-based approach’, that is, corpus and experimental research on word order. Although S discusses these as though they were alternative theories of the same thing, they are in many ways not really comparable. LT seeks inductive generalizations over directly observable patterns, with relatively little attention devoted to explaining those generalizations. The performance-based approach tries to explain word-order patterns (both within and across languages) on the basis of processing efficiency, that is, what makes utterances easy or hard to produce and comprehend.
MP, by contrast, has little concern with what is directly observable or with the efficiency of linguistic processing. Rather, it seeks to deduce properties of language from three ‘dimensions to the minimalist position: (1) virtual conceptual necessity; (2) economy; and (3) symmetry’ (80). Facts about languages play a role in this enterprise only to the extent that they can be shown to follow from or contradict the analyses so deduced. The OT research S presents shares a largely top-down approach with MP, but, as S writes, ‘OT needs to take into account what word order actually looks like on the surface’ (183). Later on the same page, S characterizes LT as ‘data-driven’, and MP (and its transformational predecessors) as ‘theory-driven’, and says ‘OT seems to strike a balance’ between the two. As a practitioner of the ‘performance-based approach’ who has not kept up with the other literature S surveys, I learned a great deal from reading Chs. 2–5. Ch. 6 covers material I was already familiar with, allowing me to assess the accuracy and completeness of S’s coverage. In what follows, I comment on these chapters individually. Ch. 2 examines the linguistic-typological approach. The method of LT, though inductive, presupposes prior theoretical choices. Going back to Greenberg’s (1963) work in this area, typological generalizations about word order have been expressed in terms of subject, object, verb, preposition, postposition, and so on. These categories and their application to particular utterances involve implicit theorizing. Moreover, typologists’ claims about the word order of a particular language are claims about basic or predominant orders, claims that require examination and analysis of a great deal of primary data. Given these complexities, it is striking how much progress has been made in LT over the half century since the publication of Greenberg’s seminal paper.
The number of languages surveyed has vastly increased; the set of elements whose relative orderings have been tested and correlated has expanded; general formulations unifying various ordering correlations have been proposed and tested; family and areal tendencies have been explored; and some proposals have emerged for explaining why certain typological generalizations hold. This progress in LT has not been without internal disagreements—for example, over whether high-level generalizations about constituent ordering should be stated in terms of heads...
- Research Article
8
- 10.1016/0388-0001(96)00010-1
- Jan 1, 1996
- Language Sciences
Clefts, particles and word order in languages of Europe
- Research Article
- 10.1353/lan.1998.0257
- Jun 1, 1998
- Language
BOOK NOTICES 411 We view this collection of articles not only as a fruitful example of a collaborative research task but also as a valuable contribution to the study of Danish prosody. It should be profitable reading for students and scholars alike and will prove a useful reference book in the study of general prosody. [Pilar Prieto i Vives, Universitat de Vic, Spain.]
Introduction to typology: The unity and diversity of language. By Lindsay J. Whaley. Thousand Oaks: Sage Publications, 1997. Pp. xxvi, 280. Paper $22.95. A broad descriptive overview of the goals, history, and methods of typological analysis as well as a comparison, classification, and explanation of shared properties of the world's languages constitutes the basis of this introductory account. Designed to complement existing introductory books on typology, Introduction to typology incorporates developments in the field in the past decade in addition to covering topics such as tense, aspect, subordination, and coordination. Areas of traditional interest to typology (constituent order, morphological types, hierarchies, and animacy) are also addressed. Careful attention to organization, definitions, and clarity of presentation promises to make this book a popular reference for students, teachers, and scholars. The book has six parts subdivided into sixteen chapters. The volume also contains a glossary (281-92), references (293-304), an index (305-21), and an introduction, 'The world's languages in overview' (xvii-xxiii). A map of the languages cited in the book rather than a list would seem more appropriate for an introductory text. Part I, 'Basics of language typology', includes four chapters. Ch. 1, 'Introduction to typology and universals' (3-17), defines language universals and typology. Ch. 2, 'A (brief) history of typology' (18-29), highlights the major contributions of early typologists (Humboldt, Greenberg, etc.). An engaging discussion of their discoveries, subsequent controversies, and verification of early insights makes for compelling reading. Ch. 3, 'Issues of method and explanation' (30-53), addresses major typological methods and explanations with concomitant controversies. Ch. 4, 'Basic categories' (54-75), considers the approaches and complications of defining lexical classes, semantic roles, and grammatical relations cross-linguistically. Part II, 'Word order typology', includes two chapters. Ch. 5, 'Constituent order universals' (79-95), introduces constituent order and potential correlations with the ordering of syntactic categories. One explanation for these correlations (branching direction theory) demands a more extensive understanding of current syntactic theory than an introductory linguistics course provides. Ch. 6, 'Determining basic constituent order' (96-107), looks at frequency and markedness as tests for determining the basic word order of a language. Part III, 'Morphological typology', consists of two chapters. Ch. 7, 'Morphemes' (111-26), a descriptive classification of morphemes, is the basis for the subsequent discussion of the morphological classification of languages in Ch. 8, 'Morphological typology' (127-48). Relatively recent work on head and dependent marking is included. The remaining three parts compare and classify selected grammatical constructions rather than languages. The chapters are generally descriptive with relatively less explanation given for these constructions. Part IV, 'Encoding relational and semantic properties of nominals', consists of Ch. 9, 'Case and agreement systems' (151-69); Ch. 10, 'Animacy, definiteness, and gender' (170-81); and Ch. 11, 'Valence' (183-200). Part V, 'Verbal categories', consists of Ch. 12, 'Tense and aspect' (203-18); Ch. 13, 'Mood and negation' (219-32); and Ch. 14, 'Morphosyntax of speech acts' (233-44). Part VI, 'Complex clauses', consists of Ch. 15, 'Subordination' (247-66), and Ch. 16, 'Coordination and cosubordination' (267-80). [Mayrene Bentley, Michigan State University.]
Knowledge and skills in translator behaviour. By Wolfram Wilss. (Benjamins translation library, 15.) Amsterdam & Philadelphia: John Benjamins, 1996. Pp. xiii, 259. Is translation science or art? This false dichotomy is often contemplated but seldom engaged with the same balance and insight as by Wilss. For W, translation studies (TS), combining theory, methodology, and practice (ix), needs to embrace the underlying principle that in translator performance, 'knowledge' and 'skills' are inseparable. At some point, all translators will ask themselves the question: 'What is actually happening in my mind when I translate?' Grappling with this apparent imponderable, W aims to give his readers 'insight into what translators really do and to explain the concepts and tools of the trade...
- Research Article
1
- 10.6018/ijes/2011/2/149661
- Dec 1, 2011
- International Journal of English Studies
This paper is concerned with how there-constructions may have helped to achieve discourse coherence in the recent history of English. From the theoretical framework of Meta-Informative Centering Theory (MIC) the paper explores the possibility of establishing a relation between the syntactic structures under analysis and the distinction between 'smooth-shift' and 'rough-shift' transitions from one centre of attention to another (Brennan, Friedman & Pollard, 1987). This will help, ultimately, to investigate the interaction between centering and MIC theories, word order and information structure in a 'non-free' word order language such as English. A corpus-driven analysis of the behaviour of spoken and written there-constructions from late Middle English to Present Day English will show their capacity to function either as highly coherent structures that continue with the same local topic as the previous utterance(s), or as means to shift the local focus of attention.