Organising educational processes effectively with the support of intelligent learning systems requires choosing technologies that ensure individualised learning, adequate perception of learning content, and so-called “understanding” of Ukrainian-language texts supplied by students (descriptions of how a task was solved, answers given in their own words rather than selected from test answer options, questions to the system, etc.), as well as prototyping, constant iteration during natural language text recognition and processing, and maximum reliability and efficiency of learning processes. The purpose of the article is to study and analyse methods of natural language processing (NLP) and the concept of NLP itself, and to consider common problems and prospects in developing a software product for processing Ukrainian-language text in online courses supported by intelligent learning systems based on NLP. The research methods are the main methodological approaches and technological tools for analysing natural language texts in intelligent educational systems and for developing a system that supports NLP technology in the linguistic analysis of texts in Ukrainian. They include, in particular, systemic and comparative analyses to identify the features of intelligent systems and of information systems with elements of intellectualisation, and the method of expert evaluation, which involves studying literary sources and information resources, interviewing and surveying experts, and examining the processes of developing and testing intelligent and information systems.
The novelty of the study lies in its analysis of modern technologies for developing systems that support online education by organising the perception of information that students provide in natural language. The results can be used by developers to build their own software product that supports the educational process in Ukrainian and improves learning efficiency by applying NLP technology to the study of the relevant academic content. Conclusions. The paper analyses modern NLP methods. The analysis led to the selection of tokenisation, normalisation, stemming and lemmatisation for use in intelligent learning systems in the linguistic analysis of so-called “free” communication by students in natural (Ukrainian) language while they study the educational content of online courses. During the tokenisation of Ukrainian-language texts, we addressed problems such as eliminating so-called “merged” tokens, correcting spelling mistakes, identifying common prefixes in compound words and their impact on the semantics of the corresponding lexemes, identifying common prefixes in abbreviations, and bringing words to their normal form. Lemmatisation is especially important for Ukrainian, with its large number of noun and adjective cases and other word forms, and it requires specially compiled dictionaries of the subject area under consideration. In these dictionaries, word forms are mapped to their lemmas (for example, nouns are given in the nominative case).
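As an illustration only, the normalisation, tokenisation and dictionary-based lemmatisation steps named above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the system described in the article: the `LEMMAS` mini-dictionary, the function names and the sample answer are all hypothetical, and stemming and spelling correction are left out.

```python
import re
import unicodedata

# Hypothetical mini-dictionary (word form -> lemma). A real system would load
# a specially compiled dictionary for the subject area instead.
LEMMAS = {
    "задачі": "задача",
    "задачу": "задача",
    "рівнянням": "рівняння",
    "розв'язав": "розв'язати",
    "розв'язала": "розв'язати",
}

# Apostrophe variants that commonly appear in typed Ukrainian text.
APOSTROPHES = str.maketrans({"\u2019": "'", "\u02bc": "'", "`": "'"})

# A word is a run of letters, optionally joined by apostrophes or hyphens.
WORD_RE = re.compile(r"[^\W\d_]+(?:['\-][^\W\d_]+)*")


def normalise(text: str) -> str:
    """Compose characters (NFC), unify apostrophes, and lower-case the text."""
    # NFC keeps letters such as "й" and "ї" as single code points, so the
    # word pattern does not split them from a combining mark.
    text = unicodedata.normalize("NFC", text)
    return text.translate(APOSTROPHES).lower()


def tokenise(text: str) -> list[str]:
    """Return word tokens; punctuation is dropped, so words that were merged
    by a missing space after punctuation come out as separate tokens."""
    return WORD_RE.findall(normalise(text))


def lemmatise(tokens: list[str]) -> list[str]:
    """Look each token up in the dictionary; unknown tokens stay unchanged."""
    return [LEMMAS.get(token, token) for token in tokens]


if __name__ == "__main__":
    # Hypothetical student answer with a typographic apostrophe and a
    # missing space after the full stop.
    answer = "Я розв\u2019язала задачу.Рівнянням!"
    print(lemmatise(tokenise(answer)))
    # ['я', "розв'язати", 'задача', 'рівняння']
```

The dictionary lookup is what reduces case forms such as "задачу" to the nominative "задача". Tokens missing from the dictionary pass through unchanged, which is why coverage of the subject-area vocabulary matters.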