Abstract
This study explores machine learning (ML) for assessing elementary students' scientific argumentation skills, addressing the inefficiency of traditional teacher-based evaluation. We developed tailored test questions and scoring rubrics and collected responses from sixth-graders in China. Four shallow learning algorithms (Multinomial Naive Bayes, k-NN, Random Forest, Logistic Regression) and two deep learning models (TextCNN, Bi-LSTM+Attention) were compared. Bi-LSTM+Attention outperformed the others, reaching 85.87% accuracy with strong human-machine scoring consistency, demonstrating practical applicability. The findings offer an effective tool for the quantitative assessment of argumentation skills and provide empirical support for ML applications in educational evaluation. This approach reduces grading labor while enhancing assessment objectivity, advancing both theoretical frameworks and classroom implementation strategies for competency-based education.
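To make the best-performing architecture concrete, below is a minimal sketch of a Bi-LSTM+Attention text classifier of the kind the abstract describes, assuming PyTorch; all names and hyperparameters (vocabulary size, embedding and hidden dimensions, number of score levels) are illustrative assumptions, not values reported in the paper.

```python
# A minimal Bi-LSTM + attention classifier sketch (PyTorch assumed).
# Hyperparameters below are placeholders, not the paper's settings.
import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, hidden_dim=64, num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)  # scores each time step
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded student responses
        h, _ = self.lstm(self.embedding(token_ids))   # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # attention over time steps
        context = (weights * h).sum(dim=1)            # weighted sum of states
        return self.fc(context)                       # logits per score level

# Usage: a batch of 8 responses padded to length 50
logits = BiLSTMAttention()(torch.randint(1, 5000, (8, 50)))
print(logits.shape)  # torch.Size([8, 4])
```

The attention layer weights each token's hidden state before pooling, which is what lets such a model emphasize argument-bearing phrases in a response rather than treating all tokens equally.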