Predicting Correct Answer Rates and Analyzing Key Factors for the College Scholastic Ability Test English Section Using Data Mining Techniques

Hee Jin Park, Pil Sung Kang, Youn Ho Lee, Kyoung Ye Jang, Woo Je Kim

doi:10.3745/ktsde.2015.4.11.509

Abstract

The College Scholastic Ability Test (CSAT) is the primary test for evaluating the academic achievement of high-school students in South Korea, and most universities use it for admission decisions. Because its level of difficulty is a significant issue for both students and universities, the government makes a considerable effort to keep the difficulty level consistent from year to year. However, the actual level of difficulty has fluctuated significantly, which causes many problems for university admission. In this paper, unlike traditional methods that depend on experts’ judgments, we use accumulated CSAT test data to build two types of data-driven models that predict the correct answer rate and identify significant factors for the CSAT English test. First, we derive candidate question-specific factors that can influence the correct answer rate, such as position, EBS-relation, and readability, from 10 years of annual CSAT practice tests and CSATs. In addition, we derive context-specific factors by employing topic modeling, which identifies the underlying topics in the text. The correct answer rate is then predicted by multiple linear regression, and the level of difficulty is predicted by a classification tree. The experimental results show that the level-of-difficulty (difficult/easy) classification model achieves 90% accuracy, whereas the error rate for the correct answer rate is below 16%. Points and problem category are found to be critical for predicting the correct answer rate. The correct answer rate is also influenced by some of the topics discovered by topic modeling. Based on our study, it will be possible to predict the range of the expected correct answer rate at both the question level and the entire-test level, which will help CSAT examiners control the level of difficulty.
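The regression step mentioned in the abstract can be illustrated with a minimal sketch of multiple linear regression fitted by ordinary least squares. This is not the authors' model or data: the three features (points, position, an EBS-related flag) are only loosely named after factors listed in the abstract, the helper names are hypothetical, and every number is synthetic, generated from a known linear rule so that the fit can be checked.

```python
# Toy sketch: ordinary least squares for a multiple linear regression.
# ASSUMPTION: all feature names and numbers are invented for illustration;
# none of them come from the paper's data set or results.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so each row operation updates both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, targets):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * t for xi, t in zip(x, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, row):
    """Predicted correct answer rate for one question's feature tuple."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

# Hypothetical features per question: (points, position in test, EBS-related flag).
questions = [(2, 1, 1), (2, 10, 0), (3, 20, 1), (3, 30, 0), (2, 40, 1), (3, 45, 0)]
# Synthetic correct answer rates (%) produced by a known linear rule.
correct_rate = [95 - 10 * p - 0.5 * pos + 5 * ebs for p, pos, ebs in questions]

coefs = fit_ols(questions, correct_rate)
estimate = predict(coefs, (3, 25, 1))
print(coefs, estimate)
```

Because the synthetic targets are exactly linear in the features, the fitted coefficients recover the generating rule up to floating-point error. With real test data the fit would be approximate, and its error would need to be measured on held-out questions.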


Journal: KIPS Transactions on Software and Data Engineering
Publication Date: Nov 30, 2015
Citations: 2

