The LingvoDoc system (http://lingvodoc.ispras.ru) provides a service for collaborative language documentation and for computations on the collected data. It exposes a GraphQL HTTP API for all system components, which allows users to build their own extensions for data analysis or to integrate LingvoDoc with their own software. Thanks to a special database and application design pattern, it is possible to construct offline applications integrated with the LingvoDoc system: such applications need an internet connection only once, to synchronize basic data types and to authenticate. The system allows users to construct multilayer dictionaries, attach them to a geographical map, fill documents with metadata, and share access to dictionaries with other users or with everyone. Fine-grained access control lists for sharing make it possible to separate users into groups of dictionary editors, proofreaders, and read-only users. The system also provides computational algorithms on the stored data, such as phonology computations and automatic and guided deduplication inside dictionaries. Users can choose the dictionary structure and the format most suitable for their dictionary. The supported data types are text, images, sounds (WAV, MP3, and FLAC), markups (ELAN and Praat formats), and directed and undirected links between stored entities. The system also provides storage, viewing, and processing of ELAN corpora. LingvoDoc includes 13 programs intended for dictionary authors, of which only 4 are available to all users of the system. These programs analyze language data from the phonetic, morphological, and etymological points of view. Previously, this analysis was performed manually by linguists; our programs make it possible to do it tens and sometimes hundreds of times faster.
This paper presents the documentation and an analysis of Ob-Ugric languages using the LingvoDoc system.