Abstract

Evaluating text accessibility is an urgent and labor-intensive task when preparing texts for teaching Russian as a foreign language. At the same time, the procedure for assigning a text to one of the levels of the CEFR scale (from A1 to C2) is well formalized and described in the professional literature, which opens opportunities for its automation. This paper presents Textometr, a new free web-based tool that estimates the CEFR level of any given Russian text, along with other key statistics relevant for adapting it for foreign students. The automated assessment of the text level is based on a regression model trained on a dataset of more than 800 texts from Russian textbooks for foreigners, using several machine learning and natural language processing methods. In addition to the CEFR level, the tool provides information relevant for adapting the text to educational tasks: lists of keywords and of words for a potential vocabulary list, statistics on text coverage by frequency lists and CEFR-graded vocabulary lists (lexical minima), a frequency list of the text, and a forecast of the time needed for reading. The tool's shortcomings at the current stage of development and suggested ways to address them are also discussed. Finally, the results of testing the tool's quality and the directions for its further development are reported. Textometr can provide helpful information not only to teachers and guidance teachers, but also to textbook authors and publishers, who can use it to check that the text content complies with the declared level and educational goals.
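Some of the statistics named in the abstract (a frequency list of the text, coverage by a vocabulary list, and a reading-time forecast) can be illustrated with a minimal Python sketch. This is not Textometr's implementation: the function names, the regex tokenizer, and the `words_per_minute` default are all hypothetical choices made for illustration. A real tool for Russian would most likely lemmatize words before matching them against a lexical minimum, and this sketch does not.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of Cyrillic or Latin letters, allowing
# internal hyphens. No lemmatization is performed (an assumption of this
# sketch), so inflected forms count as different words.
TOKEN_RE = re.compile(r"[а-яёa-z]+(?:-[а-яёa-z]+)*", re.IGNORECASE)


def tokenize(text):
    """Return the lowercase word tokens of the text."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def frequency_list(text):
    """Return (word, count) pairs sorted by descending count."""
    return Counter(tokenize(text)).most_common()


def coverage(text, vocabulary):
    """Return the share of tokens (0.0 to 1.0) found in the vocabulary set."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in vocabulary)
    return known / len(tokens)


def reading_time_minutes(text, words_per_minute=50.0):
    """Estimate reading time; the default rate is an illustrative
    placeholder, not a value taken from the paper."""
    return len(tokenize(text)) / words_per_minute


sample = "Мама мыла раму. Мама дома."
toy_vocabulary = {"мама", "дома"}  # stand-in for a CEFR-graded word list

print(frequency_list(sample))
print(coverage(sample, toy_vocabulary))
print(reading_time_minutes(sample))
```

Coverage is computed over running words (tokens) rather than distinct words, so frequent known words raise the score. Whether the tool counts tokens or distinct lemmas is not stated in the abstract.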


Summary

Research article

Textometr: an online tool for determining the complexity level of a text in Russian as a foreign language. The aim of the study is to describe the capabilities of the new online tool Textometr and the methodology for using it to automatically analyze the complexity level of a text on the CEFR scale and to prepare the text for a Russian lesson with foreign learners. The material used to build the mathematical model that determines the text level comprised more than 800 texts from modern textbooks of Russian as a foreign language. Keywords: Russian as a foreign language, educational text, text complexity, teaching reading, text adaptation, computer-assisted language teaching, computer technologies, teaching Russian, Internet resources, Russian language instruction. Published in: Русистика.

Methods and materials
CEFR levels
Most useful words
Frequency list of the text
References