Corpus Investigations of Medieval Slavonic Manuscripts: Statistically Important N-Grams (Collocations) of Old Russian Chronicles

Viktor Baranov

doi:10.18254/s207987840009440-3

Abstract

The paper reviews the current state of Slavonic historical text corpora and the requirements they must meet for processing, searching and displaying linguistic data. Progress in this field is slow for two main reasons: producing and tagging machine-readable transcriptions by hand is labor-intensive, and special corpus managers must be developed to provide access to the data and visualize it. One currently important use of corpus data is its analysis with quantitative and statistical methods. The paper describes the functionality of the historical corpus “Manuscript” (manuscripts.ru), which comprises medieval Slavonic manuscripts of the 10th to 15th centuries. Using a subcorpus of three Old Russian chronicles (Laurentian, Hypatian, Radzivilovsky), it demonstrates how the n-gram module can reveal grammatically and semantically fixed expressions that characterize the subject matter of the texts. The statistical measures Mutual Information (MI) and T-score yield lists of relatively rare and of more frequent set expressions, respectively. The MI lists include proper names, paired names, and fixed biblical and Slavonic bookish subordinating constructions. The T-score lists give information on events, goals, persons, outcomes and their characteristics. The paper concludes that statistical measures are effective for automatically finding semantically and thematically important expressions in historical sources.
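The contrast between the two measures named in the abstract can be illustrated with a minimal sketch. It assumes the textbook corpus-linguistics definitions, MI = log2(O / E) and T-score = (O - E) / sqrt(O), where O is the observed bigram count and E = f(w1) * f(w2) / N is the count expected under independence. The abstract does not state which variants the "Manuscript" n-gram module uses, so these formulas, the function name `bigram_scores`, and the toy token list are assumptions made for illustration only, not the author's implementation or data.

```python
import math
from collections import Counter


def bigram_scores(tokens, min_count=1):
    """Return {(w1, w2): (observed, mi, t_score)} for adjacent token pairs.

    Assumed textbook definitions (not confirmed by the paper):
      expected = f(w1) * f(w2) / N
      MI       = log2(observed / expected)
      T-score  = (observed - expected) / sqrt(observed)
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        if observed < min_count:
            continue  # skip bigrams below the frequency threshold
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        scores[(w1, w2)] = (observed, mi, t_score)
    return scores


# Toy input, purely illustrative: letters stand in for word forms.
toy_tokens = ["a", "b", "a", "b", "c", "d"]
toy_scores = bigram_scores(toy_tokens)

# Rank the same bigrams by each measure to compare the orderings.
by_mi = sorted(toy_scores, key=lambda bg: toy_scores[bg][1], reverse=True)
by_t = sorted(toy_scores, key=lambda bg: toy_scores[bg][2], reverse=True)
print("top by MI:", by_mi[0])
print("top by T-score:", by_t[0])
```

On this toy input the pair seen once between two otherwise unused tokens ranks highest by MI, while the repeated pair ranks highest by T-score, which is in line with the abstract's remark that the two measures surface relatively rare and more frequent expressions.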

Similar Papers

The Axiologeme "Humility" in Greek Early Christian and Ukrainian Linguocultures: A Diachronic Dimension
Oleksandr Levko
Naukovì zapiski Nacìonalʹnogo unìversitetu «Ostrozʹka akademìâ». Serìâ «Fìlologìâ» | VOL. 1
29 Mar 2018

Rank Office Records from the RASL Manuscript 16.17.34 as a Source on the History of the Kazan Campaign of 1552
Artem Zhukov
Slovene | VOL. 9
01 Jan 2019

Miniature as a Source of Information about Russian Shipbuilding Culture of the 15th—17th Centuries
Lidia V Madikova
Observatory of Culture | VOL. 20
31 Mar 2023

Tales of Miracle-Working Icons in the Structure of the Illustrated Chronicle (Litsevoy letopisny svod)
Liudmila Zhurova
Quaestio Rossica | VOL. -
01 Jan 2015
