Abstract

In this study, we use the General Regionally Annotated Corpus of Ukrainian (GRAC, www.uacorpus.org) as an experimental field for testing stylometric approaches to variationist analysis. While quantitative methods such as binomial mixed-effects regression models and machine-learning methods such as random forests have gained considerable popularity in corpus linguistics in recent years, methods from stylometry have rarely been used for variationist analysis. Using data from GRAC, we show that a stylometric approach can be useful for analyzing the diachronic development of Standard Ukrainian in the 20th century. We take as our point of departure the two main variants of Standard Ukrainian used in the interwar period: that of Soviet Ukraine, on the one hand, and that of Western Ukraine, then part of the Polish republic, on the other. We ask: what can stylometry tell us about how these standards differed and about their subsequent fate in enlarged Soviet Ukraine after WWII?

Our analysis shows that certain specifically Western Ukrainian features common during the first decades of the 20th century did not find their way into the post-WWII standard, while others were retained. Moreover, by and large, the stylometric analysis indicates a stronger continuity of the Eastern than of the Western standard.

Methodologically, we demonstrate that stylometry can be used as a tool to start corpus-linguistic research from a bird’s-eye view and in an inductive manner, without formulating any hypotheses regarding particular variables, and later to zoom in on hitherto unknown variables representing regional or diachronic differences.
