Abstract

The present study was initiated to investigate the comparability of multiple-choice and true-false item formats when the time necessary to respond to each type of item was equated empirically. Also investigated was the relative difficulty of multiple-choice (MC), true true-false (Tf), and false true-false (tF) items measuring the same content. Results indicated that a test composed of true-false items is less reliable than one using a four-option MC format, even when empirically determined differences in the time needed to answer the respective formats were taken into account. When scores were corrected for guessing, the MC items were significantly easier than the true-false items.

Considerable discussion has taken place among measurement specialists regarding the virtues of multiple-choice versus true-false test item formats. Recent contrasting examples might include "...the advantages attributed to (true-false) are not, unfortunately, very valid..." (3:160), and "...a few (test specialists) see special virtues of efficiency and ease of preparation in (true-false items) and advocate their wide use" (2:1).

The most obvious limitation of true-false relative to multiple-choice test items is the degree to which the former is subject to guessing. Several studies have shown that the reliability of a test is directly related to the number of choices per item (1, 4, 5, 6). Similarly, it would be expected that a multiple-choice test would have greater reliability than a true-false test if the number of items were held constant. However, since a greater number of true-false items can be administered per unit of time, it is possible that, in a given amount of time, the increased number of true-false items would allow for greater reliability and more efficient sampling of content objectives than a multiple-choice format would.
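The trade-off described above, fewer options per item but more items per unit of time, can be explored with the Spearman-Brown prophecy formula, which predicts the reliability of a lengthened test. The Python sketch below is illustrative only: the function names and the reliability values are hypothetical and are not taken from the study, and the formula assumes that the added items are parallel to the existing ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability of a test lengthened by `length_factor`.

    Spearman-Brown prophecy formula: r_k = k * r / (1 + (k - 1) * r).
    Assumes the added items are parallel to the original ones.
    """
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


def length_factor_for(reliability: float, target: float) -> float:
    """Length factor needed to raise `reliability` to `target`.

    Obtained by solving the Spearman-Brown formula for k.
    """
    return target * (1.0 - reliability) / (reliability * (1.0 - target))


# Hypothetical value for illustration, not a result from the study.
r_short_tf = 0.60                                # reliability of a shorter true-false section
r_double_tf = spearman_brown(r_short_tf, 2.0)    # same section doubled in length
k_needed = length_factor_for(r_short_tf, 0.80)   # lengthening needed to reach 0.80
```

Under these assumptions, a true-false section is more reliable per unit of testing time only if the predicted reliability of the lengthened section exceeds the reliability of the multiple-choice section that takes the same time to answer.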
Using eighty-eight multiple-choice items from a published test in natural science, Ebel (2) compared formats by rewriting each multiple-choice item as a parallel true-false item. Two forms, each consisting of forty-four multiple-choice and forty-four true-false items, were developed. Reliabilities (K.R. 20) were computed for the multiple-choice and true-false sections of both forms, and assuming that true-false items could be answered per multiple-choice item, the Spearman-Brown formula was used to predict the reliability of an 88-item true-false test. For the first form, this adjusted reliability was greater than the reliability obtained for the multiple-choice section of the test; however, the inverse was true for the second form.

The present study was concerned with several currently unanswered questions. First, what is an empirically determined ratio of multiple-choice to equivalent true-false items that can be answered in a given amount of time? Second, for achievement test items administered in a classroom situation, which of the two formats under consideration results in greater reliability per unit of testing time? Third, what is the relative reliability of true true-false and false true-false items when compared to multiple-choice items? Fourth, what ratio of multiple-choice to equivalent true-false items is necessary to produce equal reliability coefficients? Lastly, after equating for differences in the effect of guessing, what is the relative difficulty of the different formats?
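Two of the quantities named above, the Kuder-Richardson formula 20 reliability and a score corrected for guessing, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the function names and the small response matrix are hypothetical, the total-score variance is taken as the population variance (some texts use the sample variance), and the correction shown is the conventional formula score, rights minus wrongs divided by one less than the number of options.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomously scored items.

    `responses` is a list of examinee rows, each a list of 0/1 item scores.
    Assumption: population variance of total scores is used.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    # Sum of p * q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_people
        pq_sum += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - pq_sum / var_total)


def corrected_score(right, wrong, n_options):
    """Formula score: rights minus wrongs / (options - 1); omits are ignored."""
    return right - wrong / (n_options - 1)


# Hypothetical data: five examinees, four items (not from the study).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
demo_reliability = kr20(demo)

# The same raw performance is penalized more heavily on two-option items.
mc_score = corrected_score(right=30, wrong=10, n_options=4)
tf_score = corrected_score(right=30, wrong=10, n_options=2)
```

Because the penalty per wrong answer is larger for two-option items, corrected scores put true-false and four-option multiple-choice items on a more comparable footing before their difficulty is compared.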
