A Comparison of Item Parameter Estimates and ICCs Produced with TESTGRAF and BILOG under Different Test Lengths and Sample Sizes.

Liane N Patsula, Marc E Gessaroli

doi:10.20381/ruor-8018

Abstract

Among the most popular techniques used to estimate item response theory (IRT) parameters are those used in the LOGIST and BILOG computer programs. Because of its accuracy with smaller sample sizes or differing test lengths, BILOG has become the standard to which new estimation programs are compared. However, BILOG is still complex and labor-intensive, and the sample sizes it requires are still rather large. For this reason, J. Ramsay developed the program TESTGRAF (1989), which uses nonparametric IRT techniques. Ramsay has claimed that TESTGRAF is much faster than the common parametric approaches in LOGIST and BILOG, that there is no loss of efficiency, and that as few as 100 examinees and 20 test questions are needed to estimate item characteristic curves (ICCs). This study examined the effects of varying sample size (N=100, 250, 500, and 1,000) and test length (20 and 40 items) on the accuracy and consistency of three-parameter logistic model item parameter estimates and ICCs from TESTGRAF and BILOG. Overall, TESTGRAF seemed to perform as well as or better than BILOG. When large bias effect sizes existed, TESTGRAF was more accurate than BILOG in all but one case. TESTGRAF was slightly less accurate than BILOG in estimating the parameter with a sample size of 1,000 and in estimating the c parameter at all sample sizes. (Contains 8 tables, 7 figures, and 25 references.) (SLD)

Full Text