Abstract

The aim of this applied research is to build algorithms for searching for phraseological units that are compatible with our previously developed model of a linguistic corpus with morphological markup following spaCy conventions. The scientific novelty lies in the fact that, for the first time within the corpus approach, a set of universal methods for finding phraseological units is proposed that requires minimal manual labor and uses elements of end-to-end digital technologies. In the course of the study, the technical parameters of the phraseological units to be searched were described; the capabilities of the author's corpus manager for manual and specialized manual queries were examined; two algorithms for a two-stage search, one for individual phraseological units and one for groups of them, were developed and tested on a representative corpus of texts from German-language media; and detailed examples of search results were provided. As a result, the validity of the developed algorithms was demonstrated, and it was established experimentally that the search error lies within an acceptable range of 0 to 14.8%.
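
The abstract does not say what the two search stages consist of, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that stage one is a coarse filter keeping sentences that contain every component lemma of a phraseological unit, and that stage two is a finer filter requiring those components to lie within a short token window. The names `Token`, `stage_one`, `stage_two`, and `max_span` are hypothetical, and the lemmas are annotated by hand here rather than produced by spaCy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One corpus token with its surface form and lemma (hand-annotated here)."""
    text: str
    lemma: str


def make_sentence(pairs):
    """Build a sentence (list of Token) from (text, lemma) pairs."""
    return [Token(text, lemma) for text, lemma in pairs]


def stage_one(sentences, component_lemmas):
    """Assumed coarse stage: keep sentences containing every component lemma."""
    required = set(component_lemmas)
    return [s for s in sentences if required <= {t.lemma for t in s}]


def stage_two(candidates, component_lemmas, max_span=6):
    """Assumed fine stage: all components must fit in a window of max_span tokens."""
    required = set(component_lemmas)
    hits = []
    for sent in candidates:
        for start in range(len(sent)):
            window = sent[start:start + max_span]
            if required <= {t.lemma for t in window}:
                hits.append(sent)
                break  # one matching window is enough for this sentence
    return hits


# Toy corpus for the German idiom "ins Gras beißen" (component lemmas: Gras, beißen).
idiom = ["Gras", "beißen"]
corpus = [
    make_sentence([("Er", "er"), ("biss", "beißen"), ("ins", "in"), ("Gras", "Gras")]),
    make_sentence([("Die", "der"), ("Kuh", "Kuh"), ("frisst", "fressen"),
                   ("Gras", "Gras"), (",", ","), ("und", "und"), ("der", "der"),
                   ("Hund", "Hund"), ("will", "wollen"), ("beißen", "beißen")]),
    make_sentence([("Der", "der"), ("Hund", "Hund"), ("schläft", "schlafen")]),
]

candidates = stage_one(corpus, idiom)   # sentences 1 and 2 contain both lemmas
hits = stage_two(candidates, idiom)     # only sentence 1 has them close together

for sent in hits:
    print(" ".join(t.text for t in sent))
```

In this toy setup the second sentence passes the coarse stage but is rejected by the window check, which shows how a second stage can reduce false positives without manual review. A window check alone cannot separate idiomatic from literal uses, so some residual search error is to be expected.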
