Abstract

This paper investigates the previously unknown behavior patterns of students in a blended learning environment. To improve prediction accuracy, a methodology for assessing students' activities first had to be established. The training set was created by combining distributed sources: the Moodle database and records from the traditional learning process. The methodology emphasizes the preprocessing phase of data mining, namely transformation and feature selection. Six feature selection methods were applied to find the most relevant subset: Information Gain, Symmetrical Uncertainty Feature Evaluation, ReliefF, Correlation-based Feature Selection, Wrapper Subset Evaluation, and Classifier Subset Evaluator. Statistical dependence was determined by calculating the mutual information measure. Naïve Bayes, Aggregating One-Dependence Estimators, Decision Tree, and Support Vector Machine classifiers were trained on subsets of different cardinality. The models were evaluated through a comparative analysis of statistical parameters and of the time required to build them. We conclude that ReliefF, Wrapper Subset Evaluation, and mutual information are the most suitable feature selection methods for the blended learning environment. The major contributions of this research are the selection of an optimal low-cardinality subset of students' activities and a significant improvement in prediction accuracy in the blended learning environment.
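To illustrate the mutual information measure mentioned above, the following is a minimal sketch in Python, assuming discretized activity features and a categorical outcome label. The function names, the feature names (`forum_posts`, `quiz_attempts`), and the toy data are hypothetical and are not taken from the study; the sketch shows only the standard definition I(X;Y) = Σ p(x,y) log₂[p(x,y) / (p(x)p(y))], not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete values."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(rows, labels):
    """Rank features by mutual information with the label, highest first.

    rows   -- list of dicts mapping feature name -> discrete value
    labels -- list of class labels, one per row
    """
    scores = {
        name: mutual_information([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two discretized activity features per student.
rows = [
    {"forum_posts": "high", "quiz_attempts": "low"},
    {"forum_posts": "high", "quiz_attempts": "high"},
    {"forum_posts": "low", "quiz_attempts": "low"},
    {"forum_posts": "low", "quiz_attempts": "high"},
]
labels = ["pass", "pass", "fail", "fail"]

ranking = rank_features(rows, labels)
```

In this toy example `forum_posts` determines the label completely and therefore ranks first, while `quiz_attempts` is independent of the label and scores zero. Keeping only the top-ranked features is one simple way to obtain a low-cardinality subset.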
