Abstract

Every time we celebrated Easter together, my friend would hide eggs filled with candy and special coins for the kids to hunt. After all the eggs had been found (sometimes with kind help from those who had done the hiding), we would enjoy watching the kids blow bubbles. They never tire of having fun with bubbles. It turns out psychometricians can have fun with bubbles too, although gaining insights is really the main purpose.
Digital assessments allow for the development of Technology-Enhanced Items (TEIs) that create richer interactions with content, often resulting in response patterns beyond those of conventional multiple-choice items. Traditionally, test developers specify scoring rubrics that map response patterns into item scores (e.g., 1, 2, 3). Scoring rubrics are often self-evident for multiple-choice items and simple TEIs. However, for items eliciting large numbers of response patterns (some unanticipated), developing scoring rubrics based on expert judgment alone (absent data) can be challenging. Bubble plots are particularly helpful in informing scoring rubric development and/or item revision.
In a bubble plot, each distinct observed response pattern is represented by a bubble; in this example, there are 306 bubbles, one for each observed response pattern. Bubble size is proportional to the number of test-takers providing the given response pattern. Bubbles are positioned along the x-axis according to the average EAP (expected a posteriori) score of the test-takers represented by the bubble, and along the y-axis according to the item score awarded to that response pattern by the scoring rubric. The average ability estimate across all observed response patterns for each item score is shown in parentheses on the y-axis. For visual ease, all responses earning the same item score have the same color (e.g., all green bubbles represent responses receiving full credit). EAP is used as the ability estimate here, but other estimates, such as raw test score, could be used as well.
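The aggregation behind such a plot can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure used for the published graphic: the record layout, the pattern labels, and the function names `summarize_bubbles` and `mean_eap_by_item_score` are all hypothetical, and the per-score average is assumed here to be weighted by the number of test-takers.

```python
from collections import defaultdict

# Hypothetical records, one per test-taker:
# (response_pattern, eap_score, item_score_from_rubric)
records = [
    ("A-B-C", 1.2, 2),
    ("A-B-C", 0.8, 2),
    ("A-C-B", -0.5, 1),
    ("A-C-B", 0.1, 1),
    ("B-A-C", 0.3, 1),
    ("C-B-A", -1.0, 0),
]


def summarize_bubbles(records):
    """Collapse test-taker records into one bubble per distinct response pattern."""
    eaps_by_pattern = defaultdict(list)
    score_by_pattern = {}
    for pattern, eap, item_score in records:
        eaps_by_pattern[pattern].append(eap)
        # Assumption: the rubric maps each pattern to exactly one item score.
        score_by_pattern[pattern] = item_score
    return [
        {
            "pattern": pattern,
            "n": len(eaps),                    # drives bubble size
            "mean_eap": sum(eaps) / len(eaps),  # x-axis position
            "item_score": score_by_pattern[pattern],  # y-axis position and color
        }
        for pattern, eaps in eaps_by_pattern.items()
    ]


def mean_eap_by_item_score(bubbles):
    """Average ability per item score (the value shown in parentheses on the y-axis).

    Assumption: the average is weighted by the number of test-takers per bubble.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for b in bubbles:
        totals[b["item_score"]][0] += b["mean_eap"] * b["n"]
        totals[b["item_score"]][1] += b["n"]
    return {score: total / n for score, (total, n) in totals.items()}


bubbles = summarize_bubbles(records)
for b in bubbles:
    print(b)
print(mean_eap_by_item_score(bubbles))
```

Each dictionary in `bubbles` carries everything needed to draw one bubble: its horizontal position, its row, and the count that sets its size.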
Bubble plots provide an intuitive way to connect the psychometric properties of items with response patterns. For example, test-takers with high ability tend to score higher on an item. But bubbles spanning a wide ability range at a given item score level might suggest that an item has low power to differentiate among test-takers at different ability levels. Such observations can sharpen focus on how specific responses are handled in scoring rubrics. Interested readers may contact Edward Kulick (ekulick@ets.org) with questions on this graphic. For bubble plots in general, I would recommend that both authors and readers pay attention to whether bubble size is encoded by the diameter or by the area. Kids who have fun with bubbles do not care about the difference between the diameter and the area: the larger the bubble, the better. But for scientific presentation, it is crucial for authors and readers to be on the same scale, because doubling the diameter of a bubble (shown as a disk) is equivalent to quadrupling its area!
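The diameter-versus-area point can be checked with a little arithmetic. The sketch below uses hypothetical counts and helper names (`disk_area`, `diameter_for_area`) to compare the two encodings for a bubble that represents twice as many test-takers as another.

```python
import math


def disk_area(diameter):
    """Area of a disk with the given diameter."""
    return math.pi * (diameter / 2) ** 2


def diameter_for_area(area):
    """Diameter of a disk with the given area."""
    return 2 * math.sqrt(area / math.pi)


# Hypothetical counts: the second bubble represents twice as many test-takers.
counts = [50, 100]

# Area encoding: area is proportional to the count.
diameters_under_area_encoding = [diameter_for_area(n) for n in counts]
diameter_ratio = diameters_under_area_encoding[1] / diameters_under_area_encoding[0]

# Diameter encoding: diameter is proportional to the count.
areas_under_diameter_encoding = [disk_area(n) for n in counts]
area_ratio = areas_under_diameter_encoding[1] / areas_under_diameter_encoding[0]

print(f"area encoding: diameter grows by a factor of {diameter_ratio:.3f}")
print(f"diameter encoding: area grows by a factor of {area_ratio:.3f}")
```

Under area encoding, doubling the count multiplies the diameter by the square root of 2; under diameter encoding, it multiplies the area by 4. Plotting tools differ in which they expect. In matplotlib, for instance, the `s` argument of `scatter` is given in points squared, that is, as an area; which tool and encoding the graphic discussed here used is not stated.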
