Construct: The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) is a 9-item surgical evaluation tool designed to assess technical competence in surgical trainees using behavioral anchors. Background: Initial development of the O-SCORE produced evidence supporting the validity of its results. Further work is required to determine whether the use of a single surgeon or an unblinded rater introduces bias. In addition, the relationship of the O-SCORE to other currently used technical assessment tools should be explored to provide validity evidence based on relations to other measures. We designed this project to provide continued validity evidence for the O-SCORE on these two issues. Approach: Nineteen residents and 2 staff orthopedic surgeons from the University of Ottawa volunteered to participate in a 2-part OSCE-style station. Participants completed a written questionnaire followed by a videotaped 10-minute simulated open reduction and internal fixation of a midshaft radius fracture. Each video was rated independently by 2 blinded staff orthopedic surgeons using an Objective Structured Assessment of Technical Skills (OSATS) global rating scale, an OSATS checklist, and the O-SCORE, applied in random order. Results: O-SCORE results appeared sensitive to surgical training level even when raters were blinded. In addition, strong agreement between two independent observers using the O-SCORE suggests that the measure captures a performance easily recognized by surgical observers. Ratings on the O-SCORE were also strongly associated with global ratings on the OSATS, currently the most validated technical evaluation tool. Collectively, these results suggest that the O-SCORE generates accurate, reproducible, and meaningful results when used in a randomized and blinded fashion, providing continued validity evidence for using this tool to evaluate surgical trainee competence.
Conclusions: The O-SCORE differentiated surgical training level when scored by blinded raters, providing further evidence of its validity. There was strong agreement between two independent observers using the O-SCORE. Ratings on the O-SCORE were also strongly associated with scores on the OSATS, currently the most validated technical evaluation tool. These results suggest that the O-SCORE yields accurate and reproducible results when used in a randomized and blinded fashion, providing continued validity evidence for this tool in the evaluation of surgical trainee competence.