Abstract

On the Principles of a Digital Text Corpus: New Opportunities in Working on Heroic Epics of the Shors

Dmitri A. Funk

Introduction

The Corpus of Folklore Texts in the Languages of Indigenous Peoples of Siberia (http://corpora.iea.ras.ru) is a collection of folklore texts in lesser used, mostly endangered, Siberian languages and was initiated in 2011 through support from the Department of Northern and Siberian Studies at the Institute of Ethnology and Anthropology of the Russian Academy of Sciences (RAS).1 The main idea was to create a corpus capable of storing both an original version and an orthographically standardized version of any text. Another task was to build a system able to perform some related analytical procedures. At the same time we were aiming at making a significant part of the unknown materials available for researchers and others able to read in the native Siberian languages.

This essay will use the Shor corpus as its main example, though the Teleut, Evenki, or Nenets corpora might easily have been chosen instead. The reasoning behind this choice is the fact that the whole project arose out of my long-term study of the Shor epics. Additionally, the Shor (like the Teleut) materials belong for the most part to me, and I am the primary individual who has been working with them within the frame of this project. It is my hope that this single example will help readers better understand what our Corpus is able to do and how it has been organized.

The Shors and Their Epics

The Shors are one of Siberia’s smaller populations: 12,888 according to the 2010 Russian census.2 This ethnic group lives mainly in the south of Western Siberia. Until the early twentieth century they provided a perfect example for illustrating the cultural way of life of taiga hunters, gatherers, and fishermen; they now primarily occupy towns in the southern part of the Kemerovo region, where we currently find about 73% of the entire Shor population.
In the last 24 years not only have the Shors dwindled in number, but the percentage of those who command the Shor language has fallen as well. Only a small percentage of the Shor, around 5 to 10% (officially 22%), are still able to speak Shor as their mother tongue, making it an extremely endangered Turkic language.3

This ethnic group is especially well known thanks to its rich heroic epic tradition, examples of which have been recorded over the last 150 years by Wilhelm Radloff, Alexander V. Adrianov, Nadezhda P. Dyrenkova, Georgij F. Babushkin, Alexander I. Smerdov, Olga I. Blagoveshchenskaya, and Andrei I. Čudoyakov, as well as by other scholars, enthusiasts, and even some story-tellers.4 Like epic tales of many other Turkic-speaking groups of Siberia (and as is also the case for Mongolian heroic epic tales), Shor epics most often involve either one or both of two themes: one where the hero embarks upon a quest for a wife and another where the hero struggles with foreign invaders. But these two main themes can be and still are realized in hundreds of variations or motif-series, using the whole richness of an epic tradition.5

Fig 1. Main territories now occupied by the Shors (originally published in Funk and Tomilov 2006:247).

I point especially at heroic epics because they are of special value in studying the language;6 indeed it is through the language of the heroic epics (and this fact has been long ascertained by linguists) that the Shor language achieved its “higher,” one could say “literary,” form. It is in the heroic epics that we discover the richness both of the lexical makeup of the language and of its grammatical structure, with every form from the simplest syntax to the most complex being featured within these works.

Principles of the Corpus

There are five main principles on which the Corpus is based:

1. To increase the number of epic texts available for researchers

There are at least 265 texts of the Shor epics stored in different archives and/or private collections. From these rich materials there were but 26 epic...

