Data Mining Techniques in Analyzing Process Data: A Didactic.

Xin Qiao, Hong Jiao

doi:10.3389/fpsyg.2018.02231

Abstract

Due to the increasing use of technology-enhanced educational assessment, data mining methods have been explored to analyse process data in log files from such assessments. However, most studies were limited to one data mining technique under one specific scenario. The current study demonstrates the use of four frequently used supervised techniques, namely Classification and Regression Trees (CART), gradient boosting, random forest, and support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, all fitted to a single assessment dataset. The USA sample (N = 426) from the 2012 Program for International Student Assessment (PISA) responding to problem-solving items is extracted to demonstrate the methods. After concrete feature generation and feature selection, classifier development procedures are implemented using the illustrated techniques. Results show satisfactory classification accuracy for all the techniques. Suggestions for the selection of classifiers are presented based on the research questions and on the interpretability and simplicity of the classifiers. Interpretations of the results from both supervised and unsupervised learning methods are provided.
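As a rough illustration of one of the unsupervised methods named above, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The toy feature vectors, the function name `kmeans`, and the explicit initial centroids are assumptions made for this example; they are not the study's data, code, or settings.

```python
import math


def kmeans(points, init_centroids, n_iter=100):
    """Plain Lloyd's algorithm. Returns (centroids, labels).

    init_centroids is passed explicitly so the sketch is deterministic.
    """
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels


# Hypothetical, already-scaled features per student
# (e.g. share of actions used, share of time spent); values are invented.
points = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),
]
centroids, labels = kmeans(points, init_centroids=[points[0], points[3]])
print(labels)
```

In practice the initial centroids are usually chosen at random or by a seeding scheme, and the algorithm is restarted several times, because k-means can settle in a poor local optimum.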

Highlights

The current study demonstrates the use of four frequently used supervised techniques, namely Classification and Regression Trees (CART), gradient boosting, random forest, and support vector machine (SVM), and two unsupervised methods, Self-organizing Map (SOM) and k-means, all fitted to a single assessment dataset.
With the advance of technology incorporated in educational assessment, researchers have been intrigued by a new type of data, process data, generated from computer-based assessment, or new sources of data, such as keystroke or eye-tracking data.
What analyses should be performed on such process data? Even though specific analytic methods are to be used for different data sources with specific features, some common analysis methods can be performed based on the generic characteristics of log files.

Summary

Introduction

With the advance of technology incorporated in educational assessment, researchers have been intrigued by a new type of data, process data, generated from computer-based assessment, or new sources of data, such as keystroke or eye-tracking data. Such data, often referred to as a “data ocean,” typically comes in very large volume and with few ready-to-use features. To take the temporal information into account, hierarchical vectorization of the rank-ordered time intervals and the time interval distribution of event pairs were introduced. In addition to these common analytic techniques, other existing data analytic methods for process data are Social Network Analysis (SNA; Zhu et al., 2016), Bayesian Networks/Bayes nets (BNs; Levy, 2014), Hidden Markov Model (Jeong et al., 2010), Markov Item Response Theory (Shu et al., 2017), diagraphs (DiCerbo et al., 2011), and process mining (Howard et al., 2010). Modern data mining techniques, including cluster analysis, decision trees, and artificial neural networks, have been used to reveal useful information about students’ problem-solving strategies in various technology-enhanced assessments (e.g., Soller and Stevens, 2007; Kerr et al., 2011; Gobert et al., 2012).
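To make the idea of deriving features from a raw log more concrete, here is a small sketch that turns one student's timestamped event sequence into action counts and mean time intervals per ordered event pair. It is a simplified stand-in, not the hierarchical vectorization mentioned above; the log format, the action names, and the function `extract_features` are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean


def extract_features(events):
    """events: list of (timestamp_seconds, action) tuples, sorted by time."""
    # Frequency of each action (a simple "bag of actions").
    action_counts = Counter(action for _, action in events)
    # Time gaps between consecutive events, grouped by ordered event pair.
    gaps = defaultdict(list)
    for (t0, a0), (t1, a1) in zip(events, events[1:]):
        gaps[(a0, a1)].append(t1 - t0)
    # Summarize each pair's gap distribution by its mean.
    mean_gap = {pair: mean(ts) for pair, ts in gaps.items()}
    return action_counts, mean_gap


# Hypothetical log for one student; action names are invented for illustration.
log = [
    (0.0, "start"),
    (2.5, "click_map"),
    (4.0, "click_map"),
    (9.0, "buy_ticket"),
    (10.0, "click_map"),
    (11.5, "click_map"),
    (12.0, "end"),
]
counts, mean_gap = extract_features(log)
print(counts["click_map"], mean_gap[("click_map", "click_map")])
```

Features of this kind could then be stacked into one row per student and passed to a classifier or clustering method; which summaries of the gap distribution to keep (mean, quantiles, counts) is a design choice left open here.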

Journal: Frontiers in Psychology
Publication Date: Nov 23, 2018
Citations: 50
License type: CC BY 4.0
