Abstract

The purpose of this study is to determine whether the English test items of the 2016 Undergraduate Placement Exam (UPE) contain differential item functioning (DIF) and differential bundle functioning (DBF) with respect to gender and school type, and to examine the possible sources of bias in the DIF items. The Mantel-Haenszel (MH), Simultaneous Item Bias Test (SIBTEST), and Multiple Indicators and Multiple Causes (MIMIC) methods were used for the DIF analyses; the DBF analyses were conducted with the MIMIC and SIBTEST methods. Expert opinions were consulted to determine the sources of bias. The data set consisted of the responses of 59,818 students to the 2016 UPE English test. The analyses of the 60 items showed that one item in the translation subtest contained DIF favoring male students. In the school-type-based analyses, nine DIF items were found in the vocabulary and grammar knowledge subtest, six in the reading comprehension subtest, and four in the translation subtest. Experts judged the one item containing DIF by gender to be unbiased, while evidence of bias was found in thirteen of the nineteen items containing DIF by school type. According to the DBF analyses, some item bundles contained DBF with respect to gender and school type. The three methods differed in the number of DIF items they identified and in the level of DIF the items contained; however, they were consistent in detecting uniform DIF.

Highlights

  • Large-scale tests are commonly used throughout the world for the selection and placement of students

  • This study is descriptive research in that it investigates differential item functioning (DIF) and differential bundle functioning (DBF) in the English test items of the Undergraduate Placement Exam (UPE), and qualitative research in that it examines the possible sources of bias in the DIF items

  • The Mantel-Haenszel (MH), Simultaneous Item Bias Test (SIBTEST), and Multiple Indicators and Multiple Causes (MIMIC) methods, which are based on Classical Test Theory (CTT), Item Response Theory (IRT), and Confirmatory Factor Analysis (CFA) respectively, were used for the analyses
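Of the three methods named above, the Mantel-Haenszel procedure is the most straightforward to illustrate: examinees are stratified by total test score, and within each stratum a 2x2 table of group membership (reference/focal) by item response (correct/incorrect) is formed. A minimal sketch follows; the function name and the synthetic no-DIF tables are illustrative assumptions, not the study's data or code.

```python
import math

def mantel_haenszel_dif(tables):
    """Mantel-Haenszel DIF statistics from per-stratum 2x2 tables.

    tables: list of (A, B, C, D) per score stratum, where
      A = reference group correct,  B = reference group incorrect,
      C = focal group correct,      D = focal group incorrect.
    Returns (alpha_mh, chi2): the common odds ratio (1.0 = no DIF)
    and the MH chi-square with continuity correction.
    """
    num = den = 0.0
    sum_a = sum_ea = sum_va = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n <= 1:
            continue  # degenerate stratum contributes nothing
        num += a * d / n
        den += b * c / n
        n_ref, n_foc = a + b, c + d      # group margins
        m1, m0 = a + c, b + d            # correct/incorrect margins
        sum_a += a
        sum_ea += n_ref * m1 / n         # E[A] under no DIF
        sum_va += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    alpha_mh = num / den
    chi2 = (abs(sum_a - sum_ea) - 0.5) ** 2 / sum_va
    return alpha_mh, chi2

# Synthetic example: two strata with identical response patterns
# in both groups, i.e. an item showing no DIF.
tables = [(40, 10, 40, 10), (25, 25, 25, 25)]
alpha, chi2 = mantel_haenszel_dif(tables)
delta_mh = -2.35 * math.log(alpha)  # ETS delta scale; 0 = no DIF
```

For a real analysis, the odds ratio is usually reported on the ETS delta scale as above, and the chi-square is compared against the 3.84 critical value (alpha = .05, df = 1).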

Introduction

Large-scale tests are commonly used throughout the world for the selection and placement of students. To make fair and sound decisions based on test results, and to select students whose abilities and interests match their departments, the ability measured by the test must be evaluated accurately. It is therefore essential that tests consist of well-qualified items. The probability of answering an item correctly must not be influenced by variables such as examinees' socioeconomic status, gender, or the type of school they studied at; otherwise the test becomes biased and might not reflect examinees' cognitive abilities.

Tools in Educ., Vol 6, No 1, (2019) pp. 48-62

