In this paper, we examine a web platform that provides the tools needed to work with folklore materials and to conduct scientific research based on them. Folklore studies involve working with audio and video materials that record performances of folk art in national languages, creating text transcripts of them with translations and comments written in a commonly spoken language, and building a picture of the world from the available resources. To structure and present this content, we use an ontology-based approach, which allows linguists to describe not only the resources but also subject knowledge in the Semantic Web style, i.e., using hierarchies of classes, objects, and the relationships between them. The main features of folklore research are the need to synchronize translations, which is achieved by creating parallel corpora of texts, and the ability to label texts with entities of the subject area, which is called semantic markup. Moreover, each corpus is associated with a particular nationality and has both its own national language and its own system of concepts describing the surrounding world. This representation imposes many non-standard requirements on the platform, such as working with arbitrary languages, supporting multiple ontologies, creating and editing national subject ontologies, semantic text markup, and presentation, navigation, and search across heterogeneous resources. The developed platform provides the tools required for this research, including tools for developing ontologies in specific national subject areas and for manual annotation of texts in real time by several specialists. The resources of the web platform are described in the resource ontology, which includes concepts such as corpus, video resource, audio resource, graphic image, person, geographical location, and text genre.
Subject-area ontologies are organized as a hierarchy whose top level is the ontology of universals, common to all folklore studies; the ontologies that inherit from it are specialized for each national corpus represented. The web application is built with the Python Django framework and the React library in TypeScript. Data is stored in a PostgreSQL database.
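The ontology hierarchy and the semantic markup of parallel texts described above can be illustrated with a minimal sketch. It is not the platform's actual data model: the class names (`Ontology`, `ParallelSegment`, `Annotation`), the concept names, and the placeholder texts are all assumptions made for illustration, and plain Python dataclasses stand in for whatever Django models the platform uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ontology:
    """A subject ontology; `parent` is the ontology it inherits from (hypothetical model)."""
    name: str
    concepts: set[str] = field(default_factory=set)
    parent: Ontology | None = None

    def has_concept(self, concept: str) -> bool:
        # Look in this ontology first, then walk up towards the universals.
        node: Ontology | None = self
        while node is not None:
            if concept in node.concepts:
                return True
            node = node.parent
        return False


@dataclass
class Annotation:
    """Links a character span of the original text to an ontology concept."""
    start: int
    end: int
    concept: str


@dataclass
class ParallelSegment:
    """One aligned unit of a parallel corpus: original text plus its translation."""
    original: str
    translation: str
    annotations: list[Annotation] = field(default_factory=list)

    def annotate(self, ontology: Ontology, start: int, end: int, concept: str) -> Annotation:
        if not (0 <= start < end <= len(self.original)):
            raise ValueError("span is outside the original text")
        if not ontology.has_concept(concept):
            raise ValueError(f"{concept!r} is not defined in {ontology.name!r} or its ancestors")
        annotation = Annotation(start, end, concept)
        self.annotations.append(annotation)
        return annotation


# Illustrative data only: concept names and texts are invented placeholders.
universals = Ontology("universals", {"Character", "Place"})
national = Ontology("example_national", {"ForestSpirit"}, parent=universals)

segment = ParallelSegment(
    original="placeholder original text",
    translation="placeholder translation",
)
ann = segment.annotate(national, 0, 11, "Character")
```

In this sketch a national ontology resolves concepts through its parent, so annotators can use both the shared universals and the concepts specific to their corpus, while the universals ontology itself does not see the national concepts.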