The Method for Comprehensive Quality Evaluation of Tests. Part 1

L P Perkhun, N M Tovmachenko, V M Kukharenko

doi:10.31767/su.3(82)2018.03.04

Abstract

The informatization of modern society has led to the wide-scale and rapid introduction of distance learning technologies in virtually all categories of Ukrainian higher education establishments (HEEs). The application of digital technologies in education is receiving close attention in Ukraine and beyond. An important component of the training process is test-based knowledge control.

Educational activities at the National Academy of Statistics, Accounting and Audit rely on criterion-oriented tests. They are implemented in the Moodle distance learning system, which allows test questions of various types to be created and reused in different sets of test tasks. The Moodle environment computes selected statistical indicators for a completed test and its individual tasks: mean score and median, standard deviation, skewness, kurtosis, internal consistency coefficient, standard error, etc. However, these characteristics are not sufficient to justify accepting the test results.

The article presents the first phase in elaborating a comprehensive method for evaluating the quality of individual test tasks and of the test as a whole. The method combines classical test theory, Data Mining and Item Response Theory methods and involves six steps. The first step, based on descriptive statistics, evaluates the obtained distribution of test results. The second step evaluates the validity of test tasks. The point-biserial correlation coefficient is computed to measure the correlation between an individual test task and a student's individual test score, with values higher than 0.5 considered satisfactory. The Pearson correlation coefficient for binary variables shows the correlation between pairs of test tasks. Test tasks that correlate negatively with the other test tasks are not considered valid and have to be corrected or replaced. The third step evaluates the factor validity of the test. The test tasks grouped by factor analysis methods are analyzed further to determine their impact on the final result, the student's individual test score. All of these steps are illustrated by an example. The computations are performed in the SPSS software package. The difference in how the results of each step are interpreted for norm-oriented and criterion-oriented tests is demonstrated.

The further steps of the method for comprehensive quality evaluation of tests, which use Data Mining and Item Response Theory methods, will be described in subsequent publications.
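The second step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' SPSS procedure: the response matrix is invented, the individual test score is taken as the plain sum of correct answers (including the task being evaluated), and a task is flagged only when it correlates negatively with every other task. For a binary task, the point-biserial coefficient is numerically the Pearson coefficient between the 0/1 task column and the score, and the Pearson coefficient between two binary tasks is the phi coefficient, so a single helper covers both.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / sqrt(sxx * syy)


# Hypothetical data: rows are students, columns are test tasks (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]

n_items = len(responses[0])
items = [[row[j] for row in responses] for j in range(n_items)]

# Assumption: the individual test score is the number of correct answers.
scores = [sum(row) for row in responses]

# Point-biserial coefficient of each task with the individual test score.
r_pb = [pearson(col, scores) for col in items]
satisfactory = [r > 0.5 for r in r_pb]  # threshold quoted in the abstract

# Phi coefficient (Pearson on binary data) for every pair of tasks.
phi = [[pearson(items[i], items[j]) for j in range(n_items)]
       for i in range(n_items)]

# Assumed flagging rule: negative correlation with all other tasks.
flagged = [
    i for i in range(n_items)
    if all(phi[i][j] < 0 for j in range(n_items) if j != i)
]

print("point-biserial:", [round(r, 3) for r in r_pb])
print("tasks to correct or replace (0-based):", flagged)
```

With this made-up matrix the last task is answered correctly mostly by low-scoring students, so it is the one flagged for correction or replacement.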

Highlights

The presentation of a method for comprehensive quality evaluation of tests, combining classical theory, Data Mining and Item Response Theory methods, is begun

Summary

SOCIAL STATISTICS

The presentation of a method for comprehensive quality evaluation of tests, combining classical theory, Data Mining and Item Response Theory methods, is begun. [...] Mushenyk are devoted to the study of the statistical indicators of test task quality built into the Moodle distance learning environment. An analysis and description of software tools for evaluating the quality of test tasks is given in the works of V. [...] The specifics of applying computer testing technologies and the theoretical aspects of the statistical evaluation of test quality were studied by L. [...] The aim of the study is to develop a comprehensive method for evaluating the quality of individual test tasks and of the test as a whole. The Moodle environment offers the computation of some statistical indicators for a completed test and its individual tasks: mean score and median, standard deviation, skewness, kurtosis, internal consistency coefficient, standard error, etc. The authors propose a comprehensive method for evaluating the quality of individual test tasks and of the test as a whole. The analysis is preceded by the computation of the students' individual test scores. A high-quality norm-oriented (NO) test should exhibit a normal distribution of the students' individual scores, its mean should coincide with the mode, and most ISi (about 68%) should cluster around the mean. Fig. 2 shows the histogram of the distribution of the students' individual test scores and its comparison with the normal distribution curve.
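The first-step check of the score distribution can be approximated as follows. This is a minimal sketch, assuming invented scores, a population (not sample) standard deviation, moment-based skewness and excess kurtosis, and "around the mean" read as "within one standard deviation"; the function name `describe_scores` is hypothetical and the estimators used by Moodle or SPSS may differ in their small-sample corrections.

```python
from statistics import mean, median, multimode, pstdev


def describe_scores(scores):
    """Descriptive indicators for a list of individual test scores."""
    m = mean(scores)
    sd = pstdev(scores)  # population SD; a sample SD is an equally valid choice
    if sd == 0:
        raise ValueError("all scores are identical; shape indicators are undefined")
    n = len(scores)
    z = [(s - m) / sd for s in scores]  # standardized scores
    return {
        "mean": m,
        "median": median(scores),
        "modes": multimode(scores),
        "sd": sd,
        "skewness": sum(v ** 3 for v in z) / n,
        "excess_kurtosis": sum(v ** 4 for v in z) / n - 3.0,
        # Share of scores within one SD of the mean (about 0.68 for a normal law).
        "share_within_1sd": sum(1 for v in z if abs(v) <= 1) / n,
    }


# Hypothetical individual test scores of nine students.
stats = describe_scores([4, 5, 5, 6, 6, 6, 7, 7, 8])
for name, value in stats.items():
    print(name, value)
```

For a norm-oriented test one would look for a mean close to the mode, skewness and excess kurtosis near zero, and roughly 68% of scores within one standard deviation; for a criterion-oriented test a skewed distribution is not by itself a defect.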

[Figure labels: Standard deviation; Test task number; Factor loadings on test tasks; Individual test score]