Abstract

The article treats archaeolinguistics as a separate field of knowledge and outlines the features that distinguish it from other disciplines in comparative studies. It analyses existing text collections and shows how they may be applied in corpus-based research on ancient languages. It also discusses approaches to creating new corpora of texts. The study focuses on Old Church Slavonic and Ancient Greek; in particular, it analyses the existing corpora in these languages, e.g., Corpus Cyrillo-Methodianum Helsingiense. Most of the corpora under study are not tagged. Some of them change the original writing system (from Glagolitic to Latin, using, for instance, ASCII), while others have restricted access. Some of the corpora are no longer available at all or are available only as part of local databases. Thus, corpus-based resources for the ancient languages in question are insufficient. To facilitate more effective research, the easiest solution is to develop new corpora using platforms that specialize in linguistic analysis (e.g., CDLI or Lingvodoc) or systems that support DIY corpora. However, such platforms are often paywalled, may have limited functionality, or lack comprehensive user guides. With all of the above in mind, there seems to be no ready-made solution for archaeolinguists who want to use a corpus-based approach in their studies. They have to either make a considerable effort to adapt an existing system to their purposes or build one of their own. In conclusion, the article proposes one possible way to address these issues.
