ABSTRACT Linear logistic test models (LLTMs), which combine item response theory with linear regression, offer an elegant method for learning about item characteristics in complex content areas. This study used LLTMs to model single-best-answer, multiple-choice-question response data from two medical subspecialty certification examinations, Critical Care Medicine and Pediatric Anesthesiology, across multiple years. Word count, proportion of complex words, number of options (3- vs. 4-option), inclusion of an image, nature of the question task (identifying risks, diagnostic test, management), and inclusion of an application context significantly predicted item difficulty in one or both exams. The two exams differed in which item characteristics were significant predictors of item difficulty and in the associated coefficient estimates, which suggests possible domain differences. This study highlights the possibilities and challenges of using LLTMs to identify item characteristics for complex assessments. The results may help inform or expedite item writing and reviewing processes.
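
For orientation, the general form of an LLTM can be written as a Rasch model whose item difficulties are constrained to be a linear function of item characteristics. The notation below is the conventional textbook one and is assumed here for illustration; it is not taken from the study, whose exact specification may differ (for example, in which characteristics enter the model or whether a normalization constant is included).

```latex
\[
P(X_{pi} = 1 \mid \theta_p, \beta_i)
  = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)},
\qquad
\beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k + c
\]
```

Here \(\theta_p\) is the ability of examinee \(p\), \(\beta_i\) is the difficulty of item \(i\), \(q_{ik}\) is the value of characteristic \(k\) for item \(i\) (for example, word count or an indicator for the presence of an image), \(\eta_k\) is the estimated contribution of characteristic \(k\) to difficulty, and \(c\) is a normalization constant. Under this form, a positive \(\eta_k\) means that the characteristic is associated with harder items.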