Development of rules for generating nominal word forms for the new-written variants of the Karelian language

I P Novak, N B Krizhanovskaya, N A Pellinen, T P Boyko

doi:10.30624/2220-4156-2020-10-4-679-691

Open Access

Journal: Bulletin of Ugric Studies
Publication Date: Jan 1, 2020
Citations: 1

Abstract

Introduction: Linking the words of texts (tokens) to the meanings of lemmas in the dictionary of the VepKar corpus significantly facilitates further work on the semantic markup of texts. In 2019, inflectional rules were developed for the Vepsian subcorpus of VepKar. On the basis of these rules, a function that generates a complete paradigm from the basic word forms was added to the corpus. When creating dictionary entries in the three Karelian subcorpora, VepKar editors have to enter a large number of word forms (about 30 for nominals and 150 for verbs). The development of an algorithm and a computer program for generating word forms of the Karelian language was therefore timely.

Objective: To show how a list of stems of the nominal parts of speech in two new-written dialects of the Karelian language can be used to create rules for the automatic generation of word forms.

Research materials: Lemmas and word forms from the Open corpus of the Vepsian and Karelian languages, the Corpus of Border Karelia, and the electronic version of the Dictionary of the Karelian language.

Results and novelty of the research: The grammatical patterns were drawn from theoretical sources studied over many years and were also identified through experiments. On this basis, a list of stems and pseudo-stems of word forms was compiled for the nominal parts of speech, a system of rules for generating word forms was developed, and the corresponding computer program was written and tested. The scientific novelty of the study lies in its being the first attempt to develop uniform rules for the automatic generation of word forms for two dialects of the Karelian language.
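The general idea of generating a paradigm from a list of stems can be illustrated with a minimal sketch. It is not the program described in the abstract: the `Rule` class, the `RULES` table, the slot labels, the endings, and the placeholder stems are all hypothetical and are not verified Karelian forms. The sketch only shows how each rule could select one stem from an entry's stem list and append an ending to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One generation rule (hypothetical structure, not taken from the paper)."""
    slot: str        # grammatical label of the generated form
    stem_index: int  # position of the stem to use in the entry's stem list
    ending: str      # ending appended to the selected stem


# Illustrative rule table only; the endings are placeholders,
# not the actual Karelian rules developed by the authors.
RULES = [
    Rule("nominative sg", 0, ""),
    Rule("genitive sg", 1, "n"),
    Rule("nominative pl", 1, "t"),
    Rule("partitive pl", 2, "ja"),
]


def generate_paradigm(stems, rules=RULES):
    """Build a slot -> word form mapping from a list of stems."""
    paradigm = {}
    for rule in rules:
        if rule.stem_index >= len(stems):
            raise ValueError(
                f"rule for {rule.slot!r} needs stem {rule.stem_index}, "
                f"but only {len(stems)} stems were given"
            )
        paradigm[rule.slot] = stems[rule.stem_index] + rule.ending
    return paradigm


# Placeholder stems standing in for the stems an editor would enter.
toy_stems = ["stemA", "stemB", "stemC"]
paradigm = generate_paradigm(toy_stems)
for slot, form in paradigm.items():
    print(f"{slot}: {form}")
```

In a design like this, an editor would enter only the few stems of a lemma, and the rule table would expand them into the full set of forms.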
