In this article, we discuss the results of the project on digitizing and translating the Historical-Etymological Dictionary of Ossetic by V.I. Abaev, which the authors have been carrying out since 2020. The entries of the original include information of several kinds: Russian translations of an Ossetic word, related examples, and usage and etymological notes. The etymological notes are usually free-form texts with a large number of cited forms from other languages (cognates, loanword sources, etc.). We argue for maximal preservation and thorough encoding of this information when creating a digital database. The optimal format for such purposes is XML markup, more specifically TEI (Text Encoding Initiative, https://tei-c.org/), its variant intended for annotating text documents. For digitizing the dictionary, we created a modification of this markup language, a module for Oxygen XML Editor, and a set of CSS files that keep our layout as close as possible to the original. The same combination of XML and CSS allows generating the print version of the dictionary using the Prince typesetting system (https://www.princexml.com/) without any additional processing. For ease of search and access, we suggest supplementing the XML sources with a relational database automatically created from the lexical entries. In our project, this database serves as the source for the online interface of the dictionary (https://ossetic.iranic.space), powered by the OnLex platform [Makarov et al. 2022]. Hence, the project combines the features of several database types, which fully corresponds to the multi-faceted character of the original dictionary. We hope that our experience can be used for other digital lexicographic projects.
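
As a rough illustration of deriving a relational database from XML entries, the following minimal Python sketch loads one TEI-style entry into SQLite. It is not the project's actual pipeline or schema: the element layout (`form/orth`, `sense/def`, `etym/mentioned/w`), the table names, the function names, and the sample entry with its placeholder forms are all assumptions made for this example only.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Invented sample entry with placeholder forms; not taken from the dictionary.
SAMPLE = """<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="entry_example">
  <form type="lemma"><orth>lemma-placeholder</orth></form>
  <sense><def xml:lang="ru">перевод</def></sense>
  <etym>
    <mentioned xml:lang="fa"><w>form-a</w></mentioned>
    <mentioned xml:lang="hy"><w>form-b</w></mentioned>
  </etym>
</entry>"""


def create_schema(conn):
    """Create two assumed tables: one for entries, one for cited forms."""
    conn.execute(
        "CREATE TABLE entries (id TEXT PRIMARY KEY, lemma TEXT, translation TEXT)"
    )
    conn.execute(
        "CREATE TABLE cited_forms ("
        "entry_id TEXT REFERENCES entries(id), lang TEXT, form TEXT)"
    )


def load_entry(conn, xml_text):
    """Parse one entry and store its lemma, translation, and cited forms."""
    entry = ET.fromstring(xml_text)
    entry_id = entry.get(XML_NS + "id")
    lemma = entry.findtext("tei:form/tei:orth", namespaces=TEI_NS)
    translation = entry.findtext("tei:sense/tei:def", namespaces=TEI_NS)
    conn.execute(
        "INSERT INTO entries VALUES (?, ?, ?)", (entry_id, lemma, translation)
    )
    # Each form cited in the etymological note becomes its own row.
    for mentioned in entry.iterfind("tei:etym/tei:mentioned", TEI_NS):
        conn.execute(
            "INSERT INTO cited_forms VALUES (?, ?, ?)",
            (
                entry_id,
                mentioned.get(XML_NS + "lang"),
                mentioned.findtext("tei:w", namespaces=TEI_NS),
            ),
        )
    conn.commit()
    return entry_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    load_entry(conn, SAMPLE)
    for row in conn.execute("SELECT lang, form FROM cited_forms ORDER BY lang"):
        print(row)
```

In this arrangement the XML file would remain the authoritative source, and the tables could be regenerated from it whenever an entry is edited.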