Abstract

This article introduces a methodology for the diachronic analysis of large historical corpora, Usage Fluctuation Analysis (UFA). UFA looks at the fluctuation of the usage of a word as observed through collocation. It presupposes neither a commitment to a specific semantic theory, nor that the results will focus solely on semantics. We focus, rather, upon a word’s usage. UFA considers large amounts of evidence about usage, through time, as made available by historical corpora, displaying fluctuation in word usage in the form of a graph. The paper provides guidelines for the interpretation of UFA graphs and three short case studies applying the technique to (i) the analysis of the word its and (ii) two words related to social actors, whore and harlot. These case studies relate UFA to prior, labour-intensive corpus and historical analyses. They also highlight the novel observations that the technique affords.

Highlights

  • This article introduces a methodology, Usage Fluctuation Analysis (UFA), for the diachronic analysis of large historical corpora

  • It rests on two simple assumptions: (i) words co-occurring in the vicinity of other words provide insight into the words’ usage and (ii) the change in the pattern of co-occurrence of words over time can identify points where their usage changes

  • Our initial attempt to explore this issue came in McEnery & Baker (2017a), in which we looked at how the concept of collocation needed to change to take account of the dimension of time – if collocation is a window onto word meaning and usage (e.g. Brezina et al., 2015; Gablasova et al., 2017), it follows that it is no more possible to talk about a word having static collocates than it is to talk of a word having static usage

Summary

Introduction

This article introduces a methodology, Usage Fluctuation Analysis (UFA), for the diachronic analysis of large historical corpora. UFA looks at the fluctuation of word usage manifested through collocation, i.e. the co-occurrence of words in texts. The technique described in this article has some obvious common ground with approaches that claim to be semantic, such as distributional semantics (Harris, 1954; Firth, 1957) and vector space models of lexical meaning. The main purpose of our technique is analytical, i.e. it describes large amounts of evidence about word usage, in different contexts, that are available in historical corpora. It is applied first to the function word its and then to the analysis of social actors through the words whore and harlot. These studies investigate the value of the technique by comparing UFA to existing corpus-based historical analyses.
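The core idea, that a word's collocates in successive time periods can be compared to reveal points of change, can be illustrated with a minimal Python sketch. It is not the procedure used in the article: the function names (`collocates`, `jaccard`, `usage_fluctuation`), the raw-frequency collocate threshold, the use of non-overlapping periods, and the Jaccard overlap as a similarity score are all simplifying assumptions chosen for illustration, and the three-sentence corpus is invented toy data.

```python
from collections import Counter


def collocates(tokens, node, span=1, min_count=1):
    """Words found within +/- `span` tokens of `node` at least `min_count` times.

    Assumption: a raw-frequency threshold stands in for a proper
    association measure.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return {w for w, c in counts.items() if c >= min_count}


def jaccard(a, b):
    """Overlap of two collocate sets (1.0 = identical, 0.0 = disjoint)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def usage_fluctuation(corpus_by_period, node, span=1, min_count=1):
    """Similarity of the node's collocates between consecutive periods."""
    periods = sorted(corpus_by_period)
    sets = [collocates(corpus_by_period[p], node, span, min_count) for p in periods]
    return [
        (periods[k], periods[k + 1], jaccard(sets[k], sets[k + 1]))
        for k in range(len(periods) - 1)
    ]


# Invented toy data: one tokenised "text" per period (hypothetical years).
toy = {
    1600: "the king lost its crown and its throne".split(),
    1610: "the king kept its crown and its sword".split(),
    1620: "a ship spread its sails in its harbour".split(),
}

scores = usage_fluctuation(toy, "its", span=1, min_count=1)
for start, end, score in scores:
    print(f"{start} -> {end}: collocate overlap {score:.2f}")
```

In this toy example the overlap is higher between the first two periods than between the last two, which is the kind of drop a plot of similarity against time would make visible. A realistic analysis would need a much larger corpus, a statistically grounded collocation measure, and some form of smoothing over the resulting series.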

Language change over time
9.102 design
Identification of collocates
Overlapping sliding window
Estimation of similarity between collocates
Non-parametric regression model
Reading a UFA graph
Case studies
UFA analysis of its
UFA analysis of whore
UFA analysis of harlot
Findings
Conclusion
