Corpus-based studies of synchronic and diachronic variation

TL;DR

This chapter reviews how corpora are used to study linguistic variation in English across time and across contexts of use. It covers diachronic change from Old to Modern English and synchronic variation as studied through the statistically sophisticated multi-dimensional approach and through variationist sociolinguistics, and it stresses the value of corpus data for tracking gradual language change and genre-based or social differences.

Abstract

Introduction

In this chapter, we turn our attention to the issue of linguistic variation, and how corpora have been employed to study differences in the English language across time and across different contexts of language use. We can interpret variation in a number of different ways. One is change over time or diachronic variation. In the two sections that follow, we will look at the use of corpora to study language change in pre-contemporary and contemporary English, respectively.

Yet while corpus-based analysis of language change is a broad field, the study of synchronic variation is even more extensive. In exploring corpus-based approaches to synchronic variation, we will focus on two rather distinct approaches. One approach, touched on briefly in the previous chapter, is strongly associated with Douglas Biber and colleagues; this is the so-called multi-dimensional (MD) approach. The other is associated with variationist sociolinguistics. Although, as we will see, these approaches have certain commonalities, they are distinct in that the MD approach looks at variation across genre (or register), with the individual text as the unit of variation, whereas variationist sociolinguistics looks at variation across class, gender or other social category, with the individual speaker as the unit of variation. We will discuss the MD approach, in particular, at some length, because it is methodologically extremely distinct and statistically sophisticated.

Diachronic change from Old English to Modern English

Looking at language change is an area of linguistics for which corpus data is particularly appropriate. No one now alive speaks Middle English as a native tongue, much less Old English; thus, even if we wish to rely on the judgements of a native speaker, we simply cannot. Instead, for these and other extinct languages there is a fixed ‘corpus’ of surviving texts which will never grow any further, except in the rare circumstance that hitherto unknown texts are discovered. An electronic corpus composed of all of these surviving texts (or a sampled subset of them) is thus the ideal tool for taking into account as much data on these historical forms as possible in an analysis of how language has changed.

The quantitative analyses enabled by corpus methods are also highly valuable for the study of language change. One quite consistent finding of research in historical linguistics is that one structure very rarely replaces another in a single, sudden change. Rather, new structures arise and are initially used infrequently, and then may later increase in frequency of use, perhaps in competition with some established structure (some examples are discussed in the following section). This kind of quantitative pattern is ideally tracked by a corpus sampling texts across time.
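The idea of tracking a new structure as it gains ground on an established one can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function name `per_period_share`, the period labels and the toy sentences are all invented for illustration, and real diachronic corpora would need proper tokenisation, spelling normalisation and far more data. The sketch simply counts two competing forms in texts grouped by period and reports the share of the newer form in each period.

```python
from collections import Counter


def per_period_share(corpus, old_form, new_form):
    """Return {period: share of new_form among old_form + new_form tokens}.

    `corpus` is an iterable of (period_label, text) pairs. Tokenisation is a
    naive lowercase whitespace split, which is an assumption made for brevity.
    A period with no occurrence of either form maps to None.
    """
    counts = {}
    for period, text in corpus:
        tokens = text.lower().split()
        period_counts = counts.setdefault(period, Counter())
        period_counts[old_form] += tokens.count(old_form)
        period_counts[new_form] += tokens.count(new_form)

    shares = {}
    for period, period_counts in counts.items():
        total = period_counts[old_form] + period_counts[new_form]
        shares[period] = period_counts[new_form] / total if total else None
    return shares


# Invented toy data, NOT real historical evidence: three "periods" in which
# a hypothetical newer form ("you") gradually displaces an older one ("thou").
toy_corpus = [
    ("period 1", "thou art kind and thou art wise but you are late"),
    ("period 2", "you are kind and thou art wise and you are late"),
    ("period 3", "you are kind and you are wise and you are late"),
]

shares = per_period_share(toy_corpus, old_form="thou", new_form="you")
for period in sorted(shares):
    print(period, shares[period])
```

Reporting a share rather than a raw count keeps periods comparable even when the amount of surviving text differs from one period to the next, which is usually the case for historical material.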

Similar Papers
  • Research Article
  • Citations: 3
  • 10.1057/s41599-022-01488-8
Regional varieties and diachronic changes in Chinese political discourse
  • Dec 24, 2022
  • Humanities and Social Sciences Communications
  • Renkui Hou + 2 more

The present paper explores the synchronic variations and diachronic changes in political discourses in Hong Kong (HK) and in Mainland of People’s Republic of China (PRC). The relationship between lengths of linguistic constructs and their immediate constituents (including sentences and clauses, and clauses and words) are fitted using the function y = ax^b based on the Menzerath–Altmann (MA) law to capture the characteristics of language as self-organizing complex systems. We found that the two fitted parameters a and b, as distinctive characteristics of complex systems, can distinguish two regional variants of political speeches from HK and PRC over different periods in time. We also found that the same parameters can capture language changes between different periods of political speeches from the PRC. More specifically, we found that regional variations and historical changes show different degrees of salience at different constituency levels. In addition, we found compounding effects between historical change and regional variations. That is, the two regional variants of political speeches are closer to each other at the earliest diachronic period as compared with the latter two periods, as represented by the fitted parameters of the relationship between sentence and clause lengths. Our results provide strong support for the hypothesis for the MA Law capturing the characteristics of language as a complex self-organizing system, as the two fitted parameters account for the interaction of diachronic language change and synchronic variation.

  • Single Book
  • Citations: 2
  • 10.1075/slcs.193
Germanic Genitives
  • Apr 16, 2018
  • Horst J Simon

The papers in this volume focus on the dynamics of one specific cell in morphological paradigms – the genitive. The high amount of diachronic and synchronic variation in all Germanic languages makes the genitive a particularly interesting phenomenon since it allows us, for example, to examine comparable but slightly different diachronic pathways, the relation of synchronic and diachronic variation, and the interplay of linguistic levels (phonology, morphology, syntax, and semantics). The findings in this book enhance our understanding of the genitive not only by describing its properties, but also by discussing its demarcation from functional competitors and related grammatical items. Under-researched aspects of well-described languages as well as from lesser-known languages (Faroese, Frisian, Luxembourgish, Yiddish) are examined. The papers included are methodologically diverse and the topics covered range from morphology, syntax, and semantics to the influence of (normative) grammars and the perception and prestige of grammatical items.

  • Research Article
  • Citations: 22
  • 10.1016/j.jeap.2023.101262
Lexical complexity changes in 100 years’ academic writing: Evidence from Nature Biology Letters
  • May 25, 2023
  • Journal of English for Academic Purposes
  • Xinye Zhou + 2 more

  • Research Article
  • Citations: 1
  • 10.1353/lan.2019.0022
Quantitative historical linguistics: A corpus framework. By Gard B. Jenset and Barbara McGillivray. Oxford: Oxford University Press, 2017. Pp. xiii, 229. ISBN 9780198718178. $88 (Hb).
  • Mar 1, 2019
  • Language
  • Dirk Geeraerts

The early twenty-first century has witnessed a major shift toward quantitative approaches in the methodology of linguistics. Specifically, whereas quantitative methods have long been a staple of sociolinguistic and psycholinguistic research, the past two decades have seen their expansion toward descriptive and theoretical grammar. In usage-based approaches to language in particular, like cognitive and probabilistic linguistics, a ‘quantitative turn’ has occurred that applies the statistical testing of hypotheses to data derived from text corpora. The central inspiration for Gard B. Jenset and Barbara McGillivray’s book is the observation that this turn toward quantitative corpus studies has not yet penetrated historical linguistics to the same extent as some other subfields of linguistics. It accordingly sets out to introduce ‘the framework for quantitative historical linguistics’. The seven chapters fall roughly into two parts. In Chs. 1 to 3, a general argumentation in support of quantitative historical linguistics is developed, whereas Chs. 4 to 7 deal with the implementation of the ensuing program. The discussion of ‘why’ thus leads naturally to a discussion of ‘how’. Two threads run through the first part of the text: a specification of the kind of quantitative historical linguistics that the authors intend to propagate, and an argumentation in favor of the model in question. Important features of this argumentation are a description of the actual situation in historical linguistics and a conceptual defense of the approach against potential objections. Organizationally, Ch. 1 introduces both threads, Ch. 2 develops the first thread, and Ch. 3 the second.
With regard to the first thread, the first chapter introduces the authors’ notion of quantitative research in historical linguistics by means of a double contrast. On the one hand, quantitative research differs from the conventional use of evidence in historical linguistics that rests on example-based categorical judgments about the existence of specific linguistic phenomena but does not look into probabilistic, distributional data about trends of variation and change of the phenomenon in question. On the other hand, quantitative historical research needs to go beyond raw frequencies, in the sense that the multidimensional nature of language requires a multivariate statistical approach. In the second chapter, this conception is further developed in terms of the distinction between corpus-based and corpus-driven approaches. Whereas the former turn to corpora primarily for illustration and confirmation, the latter use corpus data at two stages of the empirical process: corresponding to the distinction between exploratory and confirmatory statistics, quantitative distributional evidence is initially used to generate hypotheses, and subsequently for testing them. With regard to the second thread, the text provides quantitative data (appropriately, one could say) to the effect that such a method is less entrenched in historical linguistics than other fields of linguistics. This argumentation rests on a comparison of the 2012 volume of Language with six journals with a (not necessarily unique) focus on language change, such as Diachronica, Folia Linguistica Historica, and Language Variation and Change. As an explanation for the observation that historical linguistics seems to be lagging behind, the book invokes early negative experiences with glottochronology, plus the influence of structuralist and generative theories (though this is of course a factor that is not specific to historical linguistics). 
At the same time, it is demonstrated how the rise of quantitative linguistics goes hand in hand with the growing availability of electronic corpus materials—a trend that obviously creates an opportunity for historical linguistics just as for the other branches of linguistics. Next to the ‘the time is ripe, we shouldn’t lag behind’ argument, the plea for quantitative corpus research in historical linguistics includes a ‘nothing is wrong with it’ type of argumentation, in the form of a systematic rejection of potential objections. Section 3.7 skillfully refutes counterarguments from convenience, from redundancy, from scope limitations, from principle, and from pseudoscience. Crucially, it is argued that a quantitative approach is not incompatible with a categorial...

  • Single Book
  • Citations: 68
  • 10.1075/slcs.133
Synchrony and Diachrony
  • May 17, 2013

1. Acknowledgements
2. List of contributors
3. Synchrony and diachrony: Introduction to a dynamic interface (by Giacalone Ramat, Anna)
4. Part I. The role of analogy and constructions in the synchrony-diachrony interface
5. Gradualness in language change: A constructional perspective (by Trousdale, Graeme)
6. Gradual change and continual variation: The history of a verb-initial construction in Welsh (by Currie, Oliver)
7. Can you literally be scared sick?: The role of analogy in the rise of a network of Resultative and Degree Modifier constructions (by Margerie, Helene)
8. The reputed sense of be meant to: A case of gradual change by analogy (by Disney, Steve)
9. Gradualness in analogical change as a complexification stage in a language simplification process: A case study from Modern Greek dialects (by Melissaropoulou, Dimitra)
10. Part II. Synchronic variation and language change
11. Semantic maps, for synchronic and diachronic typology (by Auwera, Johan van der)
12. Synchronic gradience and language change in Latin genitive constructions (by Magni, Elisabetta)
13. Double agreement in the Alpine languages: An intermediate stage in the development of inflectional morphemes (by Wratil, Melani)
14. On variation in gender agreement: The neutralization of pronominal gender in Dutch (by De Vos, Lien)
15. Synchronic Variation and Grammatical Change: The case of Dutch double gender nouns (by Semplicini, Chiara)
16. A case study on the relationship between grammatical change and synchronic variation: The emergence of tipo[-N] in Italian (by Voghera, Miriam)
17. Grammaticalization in the present - The changes of modern Swedish typ (by Rosenkvist, Henrik)
18. Part III. Gradualness in language change
19. Gradualness in change in English (augmented) absolutes (by Pol, Nikki van de)
20. Grammatical encoding of referentiality in the history of Hungarian (by Egedi, Barbara)
21. Gradualness in contact-induced constructional replication: The Abstract Possession construction in the Circum-Mediterranean area (by Fedriani, Chiara)
22. Binding Hierarchy and peculiarities of the verb potere in some Southern Calabrian varieties (by De Angelis, Alessandro)
23. Author index
24. Subject index

  • Research Article
  • 10.1111/1460-6984.70246
The Stability of Oral Language Profiles of Children in the Early Years of School: A Longitudinal Comparison of Multidimensional and Cut-Point Approaches to Classification.
  • Apr 12, 2026
  • International journal of language & communication disorders
  • Anna Louise Taylor + 6 more

A cut-point approach to classifying children's language abilities uses a specific threshold to determine whether an individual falls into a particular group, such as children with 'typically developing language' or 'language difficulties.' This method has been frequently used in longitudinal research to track language during the early school years. Findings have suggested that language difficulties may persist, emerge or resolve during this time. This longitudinal study with stratified sampling investigated oral language profiles using a multidimensional assessment framework, comparing results across multidimensional and cut-point approaches and exploring how language profiles relate to children's functioning in Year 1. We assessed 90 children across multiple dimensions of oral language at school entry and followed them up one year later. A statistical method of combining data sources to look for groups with common characteristics (latent profile analysis) was used to identify language profiles and transitions between them. To compare the results with a cut-point approach, children were subsequently reclassified into two groups using a single cut-point from an omnibus test of oral language. Profile-related differences in early academic and psychosocial outcomes were compared using a Multivariate Analysis of Variance. Follow-up analyses using McNemar's test examined whether differences in classifications from the two classification methods were statistically significant. Three language trajectory profiles were identified using the multidimensional approach: stable average, stable low and improving. The cut-point method identified these same profiles and a small declining profile. Notably, more children were classified in the stable low group using the multidimensional approach compared to the cut-point method, and this difference was statistically significant. 
In Year 1, children classified into language profiles characterised by average or above-average abilities exhibited significantly stronger early academic outcomes compared to those in profiles associated with language difficulties. The use of a multidimensional assessment may result in greater consistency of categorical classifications over time for students with language difficulties. Further research is needed to explore the potential clinical utility of this approach to support the accurate and early identification of students with language difficulties and disorders.

What is already known on the subject

Previous studies that used cut-point methods on scores from omnibus or domain-specific tests have shown that some children demonstrate improving or declining oral language trajectories throughout their years of school. However, these methods may not fully capture the complexity of language growth and change over time.

What this paper adds to the existing knowledge

This study offers new evidence on how a multidimensional assessment approach affects the classification and stability of language trajectories from school entry to Year 1. While a multidimensional approach identifies a greater number of children with language difficulties, it also reveals greater stability in language profiles over time, particularly for children with more significant challenges.

What are the potential and actual clinical implications of this work?

The findings reinforce the need to adopt multidimensional assessment practices in research and clinical settings. Importantly, the high stability of language profiles identified using this method may increase clinician confidence in accurately identifying language difficulties at school entry.

  • Research Article
  • Citations: 168
  • 10.1177/002383099303600303
Coarticulation and phonology.
  • Apr 1, 1993
  • Language and Speech
  • John J Ohala

Many sound patterns in languages are cases of fossilized coarticulation, that is, synchronic or phonetic contextual variation became diachronic or phonological variation via sound change. An examination of languages' phonologies can therefore yield insights into the mechanisms of coarticulation. In this paper I discuss (a) the need to differentiate between phonological processes that are and are not due to coarticulation, (b) the need to differentiate between 'on-line' synchronic variation and comparable fossilized diachronic variation, (c) how to determine some of the constraints on coarticulation--especially the higher priority of maintaining acoustic-auditory, rather than articulatory, norms for the shape of speech elements, and (d) how coarticulation presents a "parsing" problem to the listener and, of course, to systems for automatic speech recognition.

  • Research Article
  • 10.1163/2405478x-90000103
Natural Sound Change and Its Patterns: The Case of the Lateral Approximant l
  • Jan 24, 2012
  • Bulletin of Chinese Linguistics
  • Wei Zheng

In light of Chinese historical phonology, modern dialects, languages of Chinese minorities and field phonetics, this paper discusses (1) the development of the Yi-initial words from Old Chinese to Middle Chinese, (2) the development of the Lai-initial words from Middle Chinese to modern dialects, (3) the phonological behavior of segment l in different syllabic positions from the perspective of evolutionary phonology. Such evolutionary developments as palatalization, velarization, nasalization, labiodentalization, fricativization, strengthening and so on can be identified for approximant l. This provides an important panchronic and typological perspective for the interpretation of both diachronic changes and synchronic variation.

  • Research Article
  • Citations: 35
  • 10.1111/lnc3.12281
Old‐age language variation and change: Confronting variationist ageism
  • May 28, 2018
  • Language and Linguistics Compass
  • Heike Pichler + 2 more

The speech of older adults (65+ years old) is a rich resource for a wide range of researchers, including oral historians, developmental psychologists, health communication scholars, speech and hearing specialists, and discourse analysts. Yet in variationist sociolinguistics—the study of language variation, language change, and their social motivations—older adults have fallen afoul of a kind of scholarly ageism. Often consigned to the status of a historical benchmark against which the speech of younger people is compared, and with only rare acknowledgment of their biological, psychological and social diversity, old‐age speakers deserve greater attention. This article provides linguists with an overview of relevant conceptualizations of age and ageing in gerontology, explains why a focus on older speakers is critical to the advancement of the study of language variation and change, and offers practical suggestions for overcoming some of the challenges associated with old‐age research.

  • Research Article
  • Citations: 91
  • 10.1515/cogl.2011.001
Variation, change and constructions in English
  • Feb 1, 2011
  • Cognitive Linguistics
  • Thomas Hoffmann + 1 more

All human languages are characterised by inherent synchronic variability (Hudson, Cognitive Linguistics 8: 73–108, 1997, English Language and Linguistics 11: 383–405, 2007a) and are subject to change over time. Consequently, due to this central role of variation and change, any explanatorily adequate cognitive theory of language should aim to account for both of these phenomena. The present special issue explores how usage-based Construction Grammars can address issues of linguistic variation and change. In particular, focusing on English, we will show how constructionist approaches provide new insights for the study of variation and change in the English language as well as how data from English can help to refine construction grammar theories. This introduction will give a short overview of aspects of constructionist approaches to language which are of relevance to the modelling of linguistic variation and change. In addition to our discussion of the modelling of synchronic and diachronic variation in construction grammar, we provide an overview of the topics addressed by the seven articles in this special issue.

  • Single Book
  • Citations: 10
  • 10.1075/ihll.9
Nasals and Nasalization in Spanish and Portuguese
  • Apr 21, 2016
  • C Elizabeth Goodin-Mayeda

Nasality, whether part of a consonant or vowel, has certain phonetic and phonological characteristics that lead to outcomes seen time and again in languages with and without common ancestries. Spanish and Portuguese constitute a particularly fruitful language pairing for studying phonological aspects of synchronic and diachronic variation, given their intimate relationship as well as the array of dialectal variation in each. This research monograph offers a comprehensive exploration of nasals and nasalization in Spanish and Portuguese with a special focus on the role of perception in order to provide insight into how perception informs models of phonetics, phonology and language change. Of interest to researchers and advanced students alike, this volume integrates phonetic and phonological models of speech perception and production, and discusses these with regards to original empirical research on the perception of nasal place features and vowel nasalization by listeners of Peninsular Spanish, Cuban Spanish and Brazilian Portuguese.

  • Book Chapter
  • Citations: 40
  • 10.1163/9789401202213_005
Synchronic and diachronic variation: the how and why of sociolinguistic corpora.
  • Jan 1, 2006
  • Kate Beeching

This paper aims to illustrate the potential of (spoken) sociolinguistic corpora for research studies in both synchronic and diachronic variation, with reference to French, and to suggest ways in which useful research corpora may be established for future generations of scholars. Spoken corpora and corpus tools are an excellent heuristic in charting distributional frequencies or probabilistic factors. Andersen (2000) suggests that the upsurge of innit and like in the COLT Corpus of adolescent English may be more than age-grading. The present paper will present broad-brush preliminary evidence with respect to the evolution of selected pragmatic particles in French.

  • Research Article
  • Citations: 3
  • 10.3765/exabs.v0i0.768
Critical discourse analysis of synchronic and diachronic variation in institutional turn-allocation
  • May 7, 2013
  • LSA Annual Meeting Extended Abstracts
  • Michael A. Shepherd

  • Single Book
  • Citations: 2
  • 10.4324/9780429316517
Diachronic Perspectives and Synchronic Variation in Southern Min
  • Mar 5, 2020

Table of Contents
List of Figures and Maps
List of Tables
Abbreviations
Acknowledgements
1. Introduction Chinfa Lien and Alain Peyraube
2. Comparatives of inequality in Southern Min: a study in diachronic change from 15th to 21st centuries Hilary Chappell, Alain Peyraube, Song Na
3. The emergence of obligative modal tioh8 in Southern Min: a change induced by semantic-pragmatic factors Ting-ting Christina Hsu
4. Negation of dynamic modals with DIT in Hainan Min Huichi Lee
5. Word change and language change: a case of as a coordinating conjunction from Archaic Chinese gong to ka7 in Taiwanese Southern Min Lin Jang Ling Lin
6. Exploration of the benefactive marker kang7 in Ming Qing Southern Min script Chian-tang Su
7. Taiwanese Southern Min hoo7 and its counterparts in the Southern Min varieties in Quemoy and Quanzhou Chai-yin Hu
8. The etymology and grammaticalization of the continuative aspect marker le(h)4: a survey from the historical documents Manjun Chen
9. Kong2 as a verb for saying 'on the move' in Taiwanese Southern Min Chinfa Lien
10. Purposives in Taiwanese Southern Min Chinfa Lien and May Wang

  • Book Chapter
  • Citations: 2
  • 10.5167/uzh-136685
Variation and change: Historical pragmatics
  • Jan 20, 2017
  • Zurich Open Repository and Archive (University of Zurich)
  • Andreas H Jucker + 1 more

All living languages are subject to variation at all levels of organization. Some of this variation leads to language change if larger groups of speakers start giving preference to a specific innovation. In this contribution we survey work that traces the diachronic variability at the level of language use, a field of research that is called historical pragmatics. After outlining some of the specific data problems that historical investigations of pragmatic variation have to deal with, we provide an overview of historical pragmatic work on specific periods, on diachronic change and on synchronic variation in earlier periods. Finally, we introduce the concept of “pragmatic variable” and discuss various methodological and theoretical problems.
