Multiple bolus trials are administered during clinical and research swallowing assessments to comprehensively capture an individual's swallowing function. Despite the valuable information these boluses provide, it remains common practice to use a single bolus (e.g., the worst score) to describe the degree of dysfunction. Researchers also often collapse continuous or ordinal swallowing measures into categories, potentially exacerbating information loss. These practices may reduce the statistical power to detect and estimate smaller, yet potentially meaningful, treatment effects. This study examined the impact of aggregating and categorizing penetration-aspiration scale (PAS) scores on statistical power and effect size estimates. We used a Monte Carlo approach to simulate three hypothetical within-subject treatment studies in Parkinson's disease (PD) and head and neck cancer (HNC) across a range of data characteristics (e.g., sample size, number of bolus trials, variability). We compared different statistical models (aggregated or multilevel) and various PAS reduction approaches (i.e., types of categorization) to examine their impact on power and the accuracy of effect size estimates. Across all scenarios, multilevel models demonstrated higher statistical power to detect group-level longitudinal change and more accurate estimates than aggregated (worst score) models. Categorizing PAS scores also reduced power and biased effect size estimates compared to an ordinal approach, though this depended on the type of categorization and the baseline PAS distribution. Multilevel models should be considered a more robust approach for the statistical analysis of multiple boluses administered in standardized swallowing protocols because of their high sensitivity and accuracy in comparing group-level changes in swallowing function.
Importantly, this finding appears to be consistent across patient populations with distinct pathophysiology (i.e., PD and HNC) and patterns of airway invasion. The decision to categorize a continuous or ordinal outcome should be grounded in the clinical or research question with recognition that scale reduction may negatively affect the quality of statistical inferences in certain scenarios.
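The information loss described above can be illustrated with a minimal sketch. It is not the study's simulation code: the three-level category scheme, the helper names `categorize` and `summarize`, and the two example patients are all assumptions made for illustration. It shows how two very different sets of bolus-level PAS scores become indistinguishable once reduced to a worst score or a category.

```python
from statistics import mean


def categorize(pas: int) -> str:
    """Collapse an 8-point PAS score into three categories.

    Hypothetical scheme assumed for illustration only; the source does not
    state which categorizations were evaluated.
    """
    if not 1 <= pas <= 8:
        raise ValueError("PAS scores range from 1 to 8")
    if pas <= 2:
        return "safe"
    if pas <= 5:
        return "penetration"
    return "aspiration"


def summarize(scores: list[int]) -> dict:
    """Reduce a set of bolus-level PAS scores to common summaries."""
    worst = max(scores)  # higher PAS means more severe airway invasion
    return {
        "worst": worst,                  # single-bolus (worst score) summary
        "category": categorize(worst),   # categorized worst score
        "mean": mean(scores),            # uses every bolus trial
        "n_trials": len(scores),
    }


# Two made-up patients, six bolus trials each (illustrative data only).
patient_a = [1, 1, 1, 2, 1, 7]  # one isolated aspiration event
patient_b = [7, 6, 7, 7, 6, 7]  # aspiration on every trial

summary_a = summarize(patient_a)
summary_b = summarize(patient_b)

print(summary_a)
print(summary_b)
```

Both patients receive the same worst score and the same category, while the bolus-level data differ on nearly every trial. A multilevel model works with the trial-level scores directly, so it retains this within-person variability; the mean is used here only as a simple stand-in to show that the underlying data differ.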