Abstract

In most widely used distance education platforms of the MOOC (Massive Open Online Course) type, lectures are delivered in English, yet the platforms attract participants from many different countries. This leads to differences in learners' usage behavior and performance. In our previous studies we tried to divide users into language groups according to their English language proficiency. In this study, we used natural language processing techniques to improve the assignment of students to language groups and to automatically generate per-group datasets from a distance education platform named FutureLearn. On FutureLearn, as on other distance education platforms, learners do not have to provide their country when registering. In addition, for some learners the country provided is where they currently live, which differs from their home country. In such cases it is not possible to determine whether English is their first, official, or secondary language. Our study focused on using regex patterns to update learners' language group labels, with the aim of using these labels in future studies such as predicting learners' language groups. The data source is the set of datasets from the «Understanding Language: Learning and Teaching-4» course on the FutureLearn platform. To update the language groups with natural language processing, we mostly used features such as learners' comments, IDs, and country information. The three language groups are: English is the official and primary language; English is official but not the primary language; and English is not an official language. By analyzing users' comments, we identified the language group of 63.06% of all learners who posted comments. For 78.19% of these learners, the group identified from comments matches the group implied by the country they provided during the registration process; for the remaining 21.81%, the home country implied by their comments differs from that country. When only the country provided at registration is used, the language group can be identified for fewer learners, and the identified groups can be wrong.
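The abstract does not list the regex patterns or the country-to-group table that were used, so the following is only a minimal sketch of the general idea, under stated assumptions: the pattern set, the `COUNTRY_GROUP` table, and the function names `group_from_comments` and `update_label` are all hypothetical and chosen for illustration. It shows how a self-declared home country found in a comment could override the country given at registration.

```python
import re

# Hypothetical mapping from country name to the three language groups named
# in the abstract. The entries are illustrative, not the study's actual table.
COUNTRY_GROUP = {
    "united kingdom": "official_primary",
    "australia": "official_primary",
    "india": "official_not_primary",
    "nigeria": "official_not_primary",
    "turkey": "not_official",
    "brazil": "not_official",
}

# Assumed self-declaration phrases; the study's real patterns are not given.
# Longer country names come first so the alternation prefers them.
_countries = "|".join(
    re.escape(name) for name in sorted(COUNTRY_GROUP, key=len, reverse=True)
)
ORIGIN_RE = re.compile(
    rf"\b(?:I(?: am|'m)(?: originally)? from|I come from|I was born in) "
    rf"(?:the )?({_countries})\b",
    re.IGNORECASE,
)


def group_from_comments(comments):
    """Return the group implied by the first self-declared country, or None."""
    for text in comments:
        match = ORIGIN_RE.search(text)
        if match:
            return COUNTRY_GROUP[match.group(1).lower()]
    return None


def update_label(registered_country, comments):
    """Prefer evidence from comments; fall back to the registration country."""
    declared = group_from_comments(comments)
    if declared is not None:
        return declared
    if registered_country:
        return COUNTRY_GROUP.get(registered_country.lower())
    return None


# A learner registered in the United Kingdom who states a different origin.
print(update_label("United Kingdom", ["I am originally from India but live in London."]))
```

In this sketch a learner with no matching comment keeps the label implied by the registration country, and a learner with neither source of evidence stays unlabeled.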
