This study examined the exchangeability of alternative methods for measuring science achievement at the elementary level. Observation of a student performing a hands-on investigation was considered the “benchmark” method for science performance assessment. Four less costly methods, or possible “surrogates” for the benchmark, were, in order of decreasing verisimilitude: (a) a notebook report of an investigation, (b) a computer simulation of an investigation, (c) short-answer questions about an investigation, and (d) multiple-choice questions about an investigation. Exchangeability of each of the four surrogates for the benchmark was examined using three different investigations: “Electric Mysteries,” “Paper Towels,” and “Bugs.” One hundred and ninety-seven fifth- and sixth-grade students were given (a) all investigations with each method, (b) a multiple-choice science achievement test, and (c) an aptitude test. Results of the exchangeability analyses indicated that only the notebook provided a reasonable surrogate for the benchmark, a finding that was replicated across the three investigations. Moreover, combinations of surrogates, including the multiple-choice science achievement test, failed to approximate the information gained from direct observation of student performance over and above the information provided by the notebook surrogate.