Introduction: The Objective Structured Clinical Examination (OSCE) is an established format for practical clinical assessment at most medical schools, and discussion is underway in Germany about making it part of future state medical exams. Examiner behavior that influences assessment results has been described. Erroneous assessments of student performance can result, for instance, from systematic leniency, inconsistent grading, halo effects, and a lack of differentiation across the full grading scale for the tasks to be performed. The aim of this study was to develop a quality assurance tool that can monitor factors influencing grading in a real OSCE and enable targeted training of examiners.

Material, Methods and Students: Twelve students at the Medical Faculty of the University of Heidelberg were each trained to perform a defined task at a particular surgical OSCE station. Definitions of an excellent and a borderline performance were established and operationalized. In a simulated OSCE during the first part of the study, the standardized student performances were assessed and graded by different examiners three times in succession; video recordings were made. The study coordinator also performed quantitative and qualitative analyses of the videos. In the second part of the study, the videos were used to investigate the examiners' acceptance of standardized examinees and to analyze potential influences of examiner experience on scoring.

Results: In the first part of the study, the OSCE scores and subsequent video analysis showed that standardizing examinees to defined performance levels at different OSCE stations is generally possible. Individual deviations from the prescribed examinee responses were observed and occurred primarily as the complexity of the OSCE station content increased. In the second part of the study, inexperienced examiners scored a borderline performance significantly lower than their experienced colleagues did (13.50 vs. 15.15, p=0.035).
No difference was seen in the evaluation of the excellent examinees. Despite identical standardization, both groups of examiners rated the item "social competence" significantly lower for examinees with borderline performances than for excellent examinees (4.13 vs. 4.80, p<0.001).

Conclusion: Standardizing examinees to previously defined performance levels is possible, making a new tool available in the future not only for OSCE quality assurance but also for training examiners. Detailed preparation of the OSCE checklists and intensive training of the examinees are essential. This new tool takes on special importance if standardized OSCEs are integrated into state medical exams and thus become high-stakes assessments.