Objective: To determine the impact of item-writing flaws and cognitive level on student performance metrics in 1 course series across 2 semesters at a single institution.

Methods: Four investigators reviewed 928 multiple-choice items from an integrated therapeutics course series. Differences in performance metrics were examined between flawed and standard items, between flawed stems and flawed answer choices, and across cognitive levels.

Results: Reviewers found that 80% of the items were flawed, the most common flaw types being implausible distractors and unfocused stems. Flawed items were generally easier than standard ones, but the type of flaw significantly affected difficulty: items with flawed stems were as difficult as standard items, whereas those with flawed answer choices were significantly easier. Most items tested lower-level skills and had more flaws than higher-level items. There was no significant difference in difficulty between lower- and higher-level cognitive items, and higher-level items were more likely to have answer flaws than stem flaws.

Conclusion: Item-writing flaws impact student performance differently depending on their type. Implausible distractors artificially lower question difficulty, even for questions designed to assess higher-level skills; this effect contributes to the lack of a significant difference in difficulty between higher- and lower-level items. Unfocused stems, by contrast, likely increase confusion and hinder performance regardless of a question's cognitive complexity.