ABSTRACT
Researchers who work with ordinal rating scales sometimes encounter situations where the scale categories do not function in the intended or expected way. For example, participants' use of scale categories may result in an empirical difficulty ordering for the categories that does not match what was intended. Likewise, the level of distinction between categories may be imprecise, such that adjacent categories differ only minimally or individual categories reflect too wide a range of the latent variable to be useful. In these cases, researchers sometimes consider combining categories to minimize the impact of these characteristics on the interpretability of scores. Although this practice is relatively common, the psychometric literature provides limited guidance on specific strategies for combining categories and on the psychometric consequences of doing so. We used empirical data from a self-efficacy scale to illustrate rating scale malfunctioning in a real-world context and to demonstrate the impact of various category-collapsing schemes on indicators of rating scale functioning. Then, we used a simulation study to consider how these category-collapsing schemes may affect rating scale functioning under different conditions. Overall, our analyses illustrate different approaches to category collapsing and their impact on psychometric results. Our results suggested that aligning the collapsing scheme with the location of the rating scale malfunctioning was generally effective in improving scale performance. We discuss implications for research and practice and identify several directions for further investigation.