Abstract

Differential item functioning (DIF) refers to a difference in item performance between equally proficient members of two demographic groups. From an item response theory (IRT) perspective, DIF can be defined as a difference between groups in item response functions. The classic example of a DIF item is a mathematics question containing sports jargon that is more likely to be understood by men than by women. An item of this kind would be expected to manifest DIF against women: They are less likely to give a correct response than men with equivalent math ability. In reality, the causes of DIF are often far more obscure. The recent book by Camilli and Shepard (1994) and the volume edited by Holland and Wainer (1993) provide an excellent background in the history, theory, and practice of DIF analysis.

There are several reasons that DIF detection may be more important for computerized adaptive tests (CATs) than it is for nonadaptive tests. Because fewer items are administered in a CAT, each item response plays a more important role in the examinees' test scores than it would in a nonadaptive testing format. Any flaw in an item, therefore, may be more consequential. Also, an item flaw can have major repercussions in a CAT because the sequence of items administered to the examinees depends in part on their responses to the flawed item. Finally, administration of a test by computer creates several potential sources of DIF that are not present in conventional tests, such as differential computer familiarity, facility, and anxiety, and differential preferences for computerized administration. Legg and Buhr (1992) and Schaeffer, Reese, Steffen, McKinley, and Mills (1993) both reported ethnic and gender group differences in some of these attributes. Powers and O'Neill (1993) reviewed the literature on this topic.
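To make the IRT definition concrete, a minimal sketch (an illustration, not taken from the article itself): under the two-parameter logistic model, the item response function for group g (reference group R or focal group F), with group-specific discrimination a_g and difficulty b_g, is

    P_g(\theta) = \Pr(X = 1 \mid \theta, g) = \frac{1}{1 + \exp\left[-a_g(\theta - b_g)\right]}.

The item shows DIF whenever P_R(\theta) \neq P_F(\theta) at some ability level \theta, that is, whenever (a_R, b_R) \neq (a_F, b_F). In this parameterization, b_R \neq b_F with a_R = a_F yields uniform DIF (the two curves are shifted but never cross), while a_R \neq a_F yields nonuniform DIF (the curves cross, so the direction of the group difference depends on \theta).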
