ABSTRACT
Construct: This study examines validity evidence of end-of-rotation evaluation scores used to measure competencies and milestones as part of the Next Accreditation System (NAS) of the Accreditation Council for Graduate Medical Education (ACGME). Background: Since the implementation of the milestones, end-of-rotation evaluations have surfaced as a potentially useful assessment method. However, validity evidence on the use of rotation evaluation scores as part of the NAS has not been studied. This article examines validity evidence for end-of-rotation evaluations that can contribute to developing guidelines that support the NAS. Approach: Data from 2,701 end-of-rotation evaluations measuring 21 of the 22 Internal Medicine milestones for 142 residents were analyzed (July 2013–June 2014). Descriptive statistics were used to measure the distribution of ratings by evaluator type (faculty, n = 116; fellows, n = 59; peer-residents, n = 131) and by postgraduate year. Generalizability analysis and higher-order confirmatory factor analysis were used to examine the internal structure of ratings. Psychometric implications for combining evaluation scores using composite score reliability were examined. Results: Milestone ratings were significantly higher for each subsequent year of training (15/21 milestones). Faculty evaluators had greater variability in ratings across milestones than fellows and residents; faculty ratings were generally correlated with milestone ratings from fellows (r = .45) and residents (r = .25), but lower correlations were found for Professionalism and Interpersonal and Communication Skills. The Φ-coefficient was .71, indicating good reliability. Internal structure supported a 6-factor solution, corresponding to the hierarchical relationship between the milestones and the 6 core competencies.
Evaluation scores corresponding to Patient Care, Medical Knowledge, and Practice-Based Learning and Improvement correlated more highly with the milestones reported to the ACGME. Mean evaluation ratings predicted problem residents (odds ratio = 5.82, p < .001). Conclusions: Guidelines for rotation evaluations proposed in this study provide useful solutions that can help program directors make decisions on resident progress and contribute to assessment systems in graduate medical education.