Automated Normalization and Analysis of Historical Texts

Paweł Skórzewski, Filip Graliński, Krzysztof Jassem

doi:10.1007/978-3-030-66527-2_6


Publication Date: Jan 1, 2020

Citations: 1

Affiliation: Adam Mickiewicz University in Poznań

#Analysis Of Historical Texts #Historical Texts


Abstract

The paper presents an original method for processing historical texts. A historical text is converted into its modernized equivalent by a tool called a diachronic normalizer, which is embedded in a linguistic toolkit. The solution has three merits. First, the toolkit architecture allows morphological constraints to be imposed on diachronization rules. Second, the diachronic normalizer can be run in a pipeline together with other NLP tools, such as parsers or translators. Third, the toolkit makes it possible to apply efficiently, during diachronic normalization, a long list of diachronic pairs discovered with the aid of word distribution vectors in historical corpora.
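The idea of diachronic pairs gated by morphological constraints can be illustrated with a minimal sketch. This is not the authors' toolkit or rule format: the `DiachronicRule` class, the `normalize` function, the English word pairs, and the part-of-speech tags are all hypothetical, and the input is assumed to have been tagged already by some upstream tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiachronicRule:
    """One hypothetical diachronic pair with an optional morphological constraint."""
    historical: str
    modern: str
    required_pos: Optional[str] = None  # None means the rule applies to any tag


# Illustrative pairs only; a real list would be far longer.
RULES = [
    DiachronicRule("olde", "old"),
    DiachronicRule("art", "are", required_pos="VERB"),   # leave the noun "art" alone
    DiachronicRule("thou", "you", required_pos="PRON"),
]

# Index rules by historical form so a long rule list is still a dictionary lookup.
RULE_INDEX = {}
for rule in RULES:
    RULE_INDEX.setdefault(rule.historical, []).append(rule)


def normalize(tagged_tokens):
    """Replace each (word, pos) token by its modern form when a rule permits it."""
    output = []
    for word, pos in tagged_tokens:
        replacement = word
        for rule in RULE_INDEX.get(word.lower(), []):
            if rule.required_pos is None or rule.required_pos == pos:
                replacement = rule.modern
                break
        output.append(replacement)
    return output


modernized = normalize([("thou", "PRON"), ("art", "VERB"), ("olde", "ADJ")])
unchanged_noun = normalize([("olde", "ADJ"), ("art", "NOUN")])
```

In this sketch the constraint on "art" shows why morphological information matters: the same surface form is rewritten when tagged as a verb and kept when tagged as a noun.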

Full Text