Quality monitoring of assessment practices should be a priority in all residency programs. Validity evidence is one of the main hallmarks of assessment quality and should be collected to support the interpretation and use of assessment data. Our objective was to identify, synthesize, and present the validity evidence reported supporting different technical skill assessment tools in otolaryngology-head and neck surgery (OTL-HNS). We performed a secondary analysis of data generated through a systematic review of all published tools for assessing technical skills in OTL-HNS (n = 16). For each tool, we coded validity evidence according to the five types of evidence described by the American Educational Research Association's interpretation of Messick's validity framework. Descriptive statistical analyses were conducted. All 16 tools included in our analysis were supported by internal structure and relationship to variables validity evidence. Eleven articles presented evidence supporting content. Response process was discussed only in one article, and no study reported on evidence exploring consequences. We present the validity evidence reported for 16 rater-based tools that could be used for work-based assessment of OTL-HNS residents in the operating room. The articles included in our review were consistently deficient in evidence for response process and consequences. Rater-based assessment tools that support high-stakes decisions that impact the learner and programs should include several sources of validity evidence. Thus, use of any assessment should be done with careful consideration of the context-specific validity evidence supporting score interpretation, and we encourage deliberate continual assessment quality-monitoring. NA. Laryngoscope, 128:2296-2300, 2018.