Abstract

This study analyzed student performance on a mock TOEIC to investigate the validity and reliability of the test items within the frameworks of Classical Test Theory (CTT) and Item Response Theory (IRT). The test, consisting of 49 listening and 51 reading multiple-choice items, was administered to 98 registered university students. Their responses to each item were statistically analyzed with respect to four indices: item difficulty, item discrimination, item guessing, and distractor attractiveness. Each item was analyzed under CTT and under IRT with a three-parameter logistic (3PL) model, using the Pearson correlation coefficient and Kuder-Richardson Formula 20 (KR-20). The item analysis showed that the test had high reliability (KR-20 = .920) and a mean difficulty of .563, within the desirable range of .25 to .75. The mean discrimination was .368, above the commonly recommended threshold of .3. However, further statistical and qualitative item analyses found that the mock TOEIC contained items with inappropriate difficulty levels, low discrimination, and implausible distractors. This study therefore offers suggestions for enhancing the quality of the mock TOEIC, which may also apply to future administrations of the test.
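
For reference, the 3PL model cited above gives the probability that an examinee of ability \theta answers item i correctly, where a_i is the discrimination, b_i the difficulty, and c_i the guessing (lower-asymptote) parameter:

P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}

and KR-20 reliability for a k-item test is

\mathrm{KR}\text{-}20 = \frac{k}{k-1}\left(1 - \frac{\sum_{j=1}^{k} p_j q_j}{\sigma_X^2}\right)

where p_j is the proportion of examinees answering item j correctly, q_j = 1 - p_j, and \sigma_X^2 is the variance of the total scores.

The CTT indices can be computed directly from a scored response matrix. The sketch below is illustrative only; the abstract does not name the study's software, and the corrected item-total correlation used here is one common way of applying the Pearson coefficient to item discrimination, not necessarily the study's exact procedure:

```python
import numpy as np

def ctt_item_analysis(responses):
    """Compute CTT item indices from a 0/1 response matrix.

    responses: array of shape (n_examinees, n_items).
    Returns (difficulty, discrimination, kr20).
    """
    X = np.asarray(responses, dtype=float)
    n, k = X.shape
    total = X.sum(axis=1)

    # Item difficulty: proportion of examinees answering each item correctly.
    p = X.mean(axis=0)
    q = 1.0 - p

    # Item discrimination: Pearson (point-biserial) correlation between each
    # item and the total score on the remaining items (corrected item-total).
    disc = np.array([
        np.corrcoef(X[:, j], total - X[:, j])[0, 1] for j in range(k)
    ])

    # KR-20 reliability, using the sample variance of total scores.
    kr20 = (k / (k - 1)) * (1.0 - (p * q).sum() / total.var(ddof=1))

    return p, disc, kr20


# Synthetic example matching the study's design (98 examinees, 100 items);
# real data would be the scored answer sheets.
rng = np.random.default_rng(0)
demo = (rng.random((98, 100)) < 0.56).astype(int)
p, disc, kr20 = ctt_item_analysis(demo)
```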
