Abstract

Item analysis of criterion-referenced tests (CRTs) presents several practical problems. Traditional item discrimination indices may be of limited informativeness if score distributions are narrow. When no previously defined mastery group is available, the CRT item difference index cannot be applied. In settings with relatively small numbers of examinees, item response theory (IRT) methods will not yield stable estimates. Likewise, in many language programs either IRT computer programs are unavailable or the results of IRT analysis will be uninformative to those involved in test development. This study examines the relationship of three item discrimination indices and the biserial correlation to IRT item information functions (IIFs) in order to provide testers with information that will be useful in contexts in which IRT analysis is inappropriate. Three indices that indicate item discrimination at the cut-score are compared to IRT results on data from three types of language tests. The indices are the phi-coefficient (Φ), the B-index, and the agreement statistic. The three types of language tests are 1) an ESL reading placement test, 2) an ESL reading achievement test, and 3) an EFL multiple-choice reading cloze test. Implications and cautions for CRT development and analysis are presented.
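
As an illustration of the three cut-score indices named in the abstract, the following minimal Python sketch computes them from a 2×2 cross-classification of item outcome (correct/incorrect) against mastery status (total score at or above the cut-score versus below it). It assumes the definitions commonly given in the CRT literature, which may differ in detail from the computations used in the study: the B-index as the item's proportion correct among masters minus its proportion correct among non-masters, the agreement statistic as the proportion of examinees whose item outcome matches their mastery status, and Φ as the correlation between the two dichotomies. The function name, the sample data, and the cut-score of 6 are hypothetical.

```python
from math import sqrt


def cut_score_indices(item, total, cut):
    """Return (B-index, agreement statistic, phi) for one dichotomous item.

    item  : list of 0/1 item scores, one per examinee
    total : list of total test scores, same order as `item`
    cut   : cut-score; examinees with total >= cut are treated as masters
            (an assumption of this sketch)
    """
    if len(item) != len(total):
        raise ValueError("item and total must have the same length")

    # Cells of the 2x2 table: item outcome by mastery status.
    a = b = c = d = 0
    for score, tot in zip(item, total):
        is_master = tot >= cut
        if score == 1 and is_master:
            a += 1  # correct, master
        elif score == 1:
            b += 1  # correct, non-master
        elif is_master:
            c += 1  # incorrect, master
        else:
            d += 1  # incorrect, non-master

    n = a + b + c + d
    masters, non_masters = a + c, b + d
    if masters == 0 or non_masters == 0:
        raise ValueError("cut-score leaves one mastery group empty")

    # B-index: proportion correct among masters minus among non-masters.
    b_index = a / masters - b / non_masters

    # Agreement: item outcome and mastery status point the same way.
    agreement = (a + d) / n

    # Phi: correlation between the two dichotomies.
    denom = sqrt((a + b) * (c + d) * masters * non_masters)
    if denom == 0:
        raise ValueError("phi is undefined when the item has no variance")
    phi = (a * d - b * c) / denom

    return b_index, agreement, phi


# Hypothetical data: 8 examinees, cut-score of 6.
item_scores = [1, 1, 1, 0, 1, 0, 0, 0]
total_scores = [9, 8, 7, 7, 4, 3, 5, 2]
b_index, agreement, phi = cut_score_indices(item_scores, total_scores, cut=6)
print(b_index, agreement, phi)
```

All three indices depend on where the cut-score is placed, so recomputing them at a different cut-score can change an item's apparent quality.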
