Abstract
The College Scholastic Ability Test (CSAT) is the primary test for evaluating the academic achievement of high-school students in South Korea, and most universities use it for admission decisions. Because its level of difficulty is a significant issue for both students and universities, the government makes a great effort to keep the difficulty level consistent every year. However, the actual level of difficulty has fluctuated significantly, which causes many problems for university admission. In this paper, unlike traditional methods that depend on experts' judgments, we use accumulated CSAT test data to build two types of data-driven models for the CSAT English test that predict the correct answer rate and identify the significant factors behind it. First, we derive candidate question-specific factors that can influence the correct answer rate, such as position, EBS-relation, and readability, from the annual CSAT practice tests and the CSAT over 10 years. In addition, we derive context-specific factors by employing topic modeling, which identifies the underlying topics in the text. Then, the correct answer rate is predicted by multiple linear regression, and the level of difficulty is predicted by a classification tree. The experimental results show that the level-of-difficulty (difficult/easy) classification model achieves 90% accuracy, whereas the error rate for the correct answer rate is below 16%. Points and problem category are found to be critical for predicting the correct answer rate. The correct answer rate is also influenced by some of the topics discovered by topic modeling. Based on our study, it will be possible to predict the range of the expected correct answer rate at both the question level and the entire-test level, which will help CSAT examiners control the level of difficulty.
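To make the two-model setup concrete, the following is a minimal sketch and not the authors' implementation. It fits a multiple linear regression by ordinary least squares to predict a correct answer rate, and a depth-1 classification tree (a decision stump) to label questions as difficult or easy. The feature set (points and position only), the 50% cutoff for "difficult", the function names, and all data values are assumptions made for illustration; the data are synthetic and do not come from the CSAT.

```python
from typing import List, Sequence, Tuple


def fit_ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Fit multiple linear regression by solving the normal equations.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def predict_rate(beta: Sequence[float], x: Sequence[float]) -> float:
    """Predict the correct answer rate for one question."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


Stump = Tuple[int, float, str, str]  # (feature index, threshold, left label, right label)


def fit_stump(X: Sequence[Sequence[float]], labels: Sequence[str]) -> Stump:
    """Fit a depth-1 classification tree by minimizing training misclassifications."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({r[j] for r in X}):
            for left, right in (("easy", "difficult"), ("difficult", "easy")):
                errors = sum(
                    (left if r[j] <= t else right) != lab
                    for r, lab in zip(X, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    return best[1:]


def classify(stump: Stump, x: Sequence[float]) -> str:
    """Label one question as 'easy' or 'difficult'."""
    j, t, left, right = stump
    return left if x[j] <= t else right


# Synthetic questions: [points, position in the test]. Not real CSAT data.
X = [[2, 5], [2, 20], [3, 30], [3, 40], [2, 25], [3, 10]]
# Synthetic correct answer rates (%), generated as 100 - 15*points - 0.5*position.
rates = [100 - 15 * p - 0.5 * pos for p, pos in X]
# Assumed cutoff: a question is "difficult" if its rate is below 50%.
labels = ["difficult" if r < 50 else "easy" for r in rates]

beta = fit_ols(X, rates)
stump = fit_stump(X, labels)
print("regression coefficients:", [round(b, 3) for b in beta])
print("predicted rate for [3, 20]:", round(predict_rate(beta, [3, 20]), 1))
print("stump:", stump, "->", classify(stump, [3, 40]))
```

In practice, a library such as scikit-learn would replace both hand-written fitters, and topic proportions from a topic model could be appended to each feature row as additional columns.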
More From: KIPS Transactions on Software and Data Engineering