Abstract

Traditionally, formulas to adjust multiple‐choice test scores for success from guessing have assumed that all guessing is purely random, although it is recognized that examinees attempt to narrow options.

The major purpose of this investigation was to see if groups of examinees, with similar ratios of right to wrong answers, differ sufficiently with respect to an analysis of the mean number of effective distracters (E) (a measure of the degree to which the number of item options were narrowed before guessing) that the E results for each group may be inferred to the individuals within that group.

The second purpose was to develop and study characteristics of scoring methods that take into account individual differences in narrowing behavior.

The data were sampled from two recent administrations of the Scholastic Aptitude Test: two verbal and two mathematical sections totaling 12,140 examinees.

Statistically significant variance analyses for all sections supported the conclusion that groups differentiated on the ratio of right to wrong answers (Ratio Groups), came from populations with different mean E values and that narrowing behavior increases as the right to wrong ratio increases. From Ratio Group variance homogeneity tests it was concluded that as the ratio increases, item differences in susceptibility to narrowing increase.

Differences among curves drawn through Ratio Group means were very small for all sections of the SAT studied. However, the differences led to the conclusion that tests of different content may be susceptible to different patterns of narrowing behavior. It was also suggested that a smoothed curve through mean E values for various Ratio Groups will intersect a similarly generated curve for various combined results of all the residual Ratio Groups, at a point at which examinees receive maximum benefit from narrowing, in relation to their knowledge.

The new scoring techniques used mean E values as best estimates for all test items. Curvilinear correlations between ratios and item E values indicated that the single estimate is a reasonably good predictor. However, an examination of the E curves for clusters of similar items showed that prediction curves for clusters of similar items could increase the importance of the ratio as a predictor of distracter effectiveness.

Four smaller samples from each SAT section were each scored in four ways. Method A corrected in direct relation to the amount of narrowing before guessing. Method B attempted to correct scores in accordance with the theoretically based amount that examinees benefit from narrowing in relation to their knowledge. The other methods were number of Right answers and Conventional Formula scoring.

The consistency of differences among intercorrelations of the scoring methods supported the conclusions that different tests of the same content and tests of different content display the same relationships among the four methods, and that in terms of ranking examinees differently from number of Right answers scoring, the order of increasing difference is from Conventional Formula to Method A to Method B.

The sixteen smaller samples were examined from the viewpoint of item analysis. Relationships between ratios and the two standard scoring methods left enough variance unexplained that it was concluded that grouping examinees by ratio rather than test score might yield a less contaminated picture of option attractiveness. It was also concluded that ratio means and standard deviations could be valuable as additional criteria for parallel test forms.

Correlations between item E values and difficulty estimates supported the conclusion that in a well‐written test the relationship between item difficulty and number of effective distracters is very low.

For the samples studied there were small but consistent reductions in reliability between number of Right answers and Conventional Formula scoring. Even smaller reductions were found between Conventional Formula and either Method A or Method B. It was concluded that correction formulas taking the ratio into account are only slightly less reliable than the Conventional Formula.
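
As a rough illustration of the scoring ideas discussed above, the sketch below contrasts the standard correction-for-guessing formula (right answers minus wrong answers divided by the number of options less one) with a hypothetical variant that divides by an estimated number of effective distracters instead. The abstract does not state the formulas for Method A or Method B, so the function `narrowed_score` and its parameter `effective_distracters` are assumptions made for illustration only and should not be read as either method.

```python
def formula_score(right: int, wrong: int, n_options: int) -> float:
    """Standard correction for guessing: R - W / (k - 1).

    Assumes every wrong answer came from a purely random guess
    among all k options of an item.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return right - wrong / (n_options - 1)


def narrowed_score(right: int, wrong: int, effective_distracters: float) -> float:
    """Hypothetical variant (an assumption, not the study's Method A or B).

    Replaces (k - 1) with an estimate E of the distracters still in play
    after narrowing, giving R - W / E.
    """
    if effective_distracters <= 0:
        raise ValueError("effective_distracters must be positive")
    return right - wrong / effective_distracters


if __name__ == "__main__":
    # Illustrative numbers only: 40 right, 12 wrong, five-option items.
    print(formula_score(40, 12, 5))
    # Same responses, assuming guesses were made among E = 2 distracters.
    print(narrowed_score(40, 12, 2.0))
```

When E equals the number of options less one, the two functions agree; a smaller E means more narrowing before guessing and therefore a larger deduction per wrong answer.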
