Probing for Bias: Comparing Populations Using Item Response Curves

Paul Walter, Crisel Suarez, Edward Nuhfer

doi:10.5038/1936-4660.14.1.1357

Abstract

We introduce an approach for quantitatively comparing the item response curves (IRCs) of any two populations on a multiple-choice test instrument. We employ both simulated and actual data, applying the approach to a dataset of 12,187 participants who took the 25-item Science Literacy Concept Inventory (SLCI), which includes ample demographic data on the participants. Prior comparisons of the IRCs of different populations addressed only two populations and were made by visual inspection. Our approach allows for quickly comparing the IRCs of many pairs of populations to identify those items where substantial differences exist. For each item, we compute the IRC dot product, a number between 0 and 1 that equals 1 when the IRCs of the two populations are identical. We then determine whether the value of the IRC dot product indicates significant differences between populations of real students. Through this process, we can quickly discover bias across demographic groups. As a case example, we apply our metric to identify four SLCI items that exhibit gender bias. We further found that gender bias on those items was present for non-science majors but not for science majors.
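
The abstract does not state the formula for the IRC dot product, so the following Python sketch is only an illustration of the general idea, not the authors' definition. It assumes that an IRC records, for each answer option, the fraction of respondents at each total score who chose that option, and that the two populations' curves are compared with a normalized dot product (cosine similarity). The function names, the normalization, and the tiny data set are all hypothetical.

```python
import math


def response_curves(records, options, max_score):
    """Build per-option response curves for one item.

    records: iterable of (total_score, chosen_option) pairs for one population.
    Returns a dict mapping each option to a list indexed by total score
    (0..max_score) holding the fraction of respondents who chose it.
    """
    counts = {opt: [0] * (max_score + 1) for opt in options}
    totals = [0] * (max_score + 1)
    for score, choice in records:
        totals[score] += 1
        counts[choice][score] += 1
    return {
        opt: [c / n if n else 0.0 for c, n in zip(counts[opt], totals)]
        for opt in options
    }


def irc_dot_product(curves_a, curves_b):
    """Assumed metric: cosine similarity of the flattened curves.

    Both arguments must cover the same options and score range. Because all
    fractions are non-negative, the result lies between 0 and 1.
    """
    keys = sorted(curves_a)
    a = [v for opt in keys for v in curves_a[opt]]
    b = [v for opt in keys for v in curves_b[opt]]
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


# Hypothetical data: (total score, chosen option) for two small populations.
group_a = [(0, "B"), (1, "A"), (2, "A"), (2, "A")]
group_b = [(0, "B"), (1, "B"), (2, "A"), (2, "B")]

curves_a = response_curves(group_a, "AB", max_score=2)
curves_b = response_curves(group_b, "AB", max_score=2)

same = irc_dot_product(curves_a, curves_a)       # identical curves
different = irc_dot_product(curves_a, curves_b)  # differing curves
print(same, different)
```

Under this assumed normalization, identical curves score 1 (up to floating-point rounding) and diverging curves score lower, so items could be ranked by the value to flag candidates for closer inspection. Note that cosine similarity also returns 1 for curves that differ only by a constant scale factor, which is one reason the published definition may differ from this sketch.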
