Objective
To determine the impact of item-writing flaws and cognitive level on student performance metrics in one course series across two semesters at a single institution.

Methods
Four investigators reviewed 928 multiple-choice items from an integrated therapeutics course series. Differences in performance metrics were examined between flawed and standard items, between flawed stems and flawed answer choices, and across cognitive levels.

Results
Reviewers found that 80% of the items were flawed, the most common flaw types being implausible distractors and unfocused stems. Flawed items were easier than standard items, and items became easier as they accumulated more flaws. These flaws benefited lower-performing students more than higher-performing ones, particularly for items with up to two flaws. Items with flawed stems did not differ in difficulty from standard items, but items with flawed answer choices were significantly easier. Most items tested lower-level cognitive skills and had more flaws than higher-level items. There was no significant difference in difficulty between lower- and higher-level cognitive items, and higher-level items were more likely to have answer-choice flaws than stem flaws.

Conclusion
Item-writing flaws affect student performance differently depending on flaw type. Implausible distractors artificially lower the difficulty of questions, even those designed to assess higher-level skills; this effect contributes to the lack of a significant difference in difficulty between higher- and lower-level items. Unfocused stems, on the other hand, likely increase confusion and hinder performance regardless of a question's cognitive complexity.