Importance: Live feedback in the operating room is essential to surgical training. Despite the role this feedback plays in developing surgical skills, an accepted methodology for characterizing its salient features has not been defined.

Objective: To quantify the intraoperative feedback provided to trainees during live surgical cases and to propose a standardized deconstruction of feedback.

Design, Setting, and Participants: In this qualitative study using a mixed methods analysis, surgeons at a single academic tertiary care hospital were audio- and video-recorded in the operating room from April to October 2022. Urology residents, fellows, and faculty attending surgeons involved in robotic teaching cases during which trainees had active control of the robotic console for at least some portion of a surgery were eligible to participate voluntarily. Feedback was time-stamped and transcribed verbatim. An iterative coding process was applied to the recordings and transcript data until recurring themes emerged.

Exposures: Feedback during audiovisually recorded surgery.

Main Outcomes and Measures: The primary outcomes were the reliability and generalizability of a feedback classification system in characterizing surgical feedback. Secondary outcomes included the utility of the system.

Results: In the 29 surgical procedures that were recorded and analyzed, 4 attending surgeons, 6 minimally invasive surgery fellows, and 5 residents (postgraduate years 3-5) participated. For the reliability of the system, 3 trained raters achieved moderate to substantial interrater reliability in coding cases using 5 types of triggers, 6 types of feedback, and 9 types of responses (prevalence-adjusted and bias-adjusted κ range: 0.56 [95% CI, 0.45-0.68] minimum for triggers to 0.99 [95% CI, 0.97-1.00] maximum for feedback and responses). For the generalizability of the system, 6 types of surgical procedures and 3711 instances of feedback were analyzed and coded by type of trigger, feedback, and response.
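The interrater-reliability statistic reported above, prevalence- and bias-adjusted kappa (PABAK), can be illustrated with a minimal sketch. This assumes the multi-category generalization PABAK = (k·p_o − 1)/(k − 1), where p_o is the observed proportion of agreement and k is the number of categories; the rater codes below are hypothetical, not study data.

```python
# Minimal sketch of prevalence- and bias-adjusted kappa (PABAK) for two raters,
# using the k-category form: PABAK = (k * p_o - 1) / (k - 1).
# The example codes are illustrative only.

def pabak(rater_a, rater_b, k):
    """Compute PABAK for two raters' category codes over the same items."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    # p_o: observed proportion of items on which the raters agree
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
    return (k * p_o - 1) / (k - 1)

# Hypothetical coding of 8 feedback instances into 5 trigger categories:
a = [0, 1, 2, 2, 3, 4, 1, 0]
b = [0, 1, 2, 1, 3, 4, 1, 0]
print(pabak(a, b, k=5))  # agreement 7/8 = 0.875 -> (5*0.875 - 1)/4 = 0.84375
```

Unlike Cohen's kappa, PABAK replaces the chance-agreement term with its value under uniform marginals, which keeps the statistic stable when some categories are rare, as is typical for coded surgical feedback.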
Significant differences in triggers, feedback, and responses reflected surgeon experience level and the surgical task being performed. For example, as a response, attending surgeons took over for safety concerns more often with fellows than with residents (prevalence rate ratio [RR], 3.97 [95% CI, 3.12-4.82]; P = .002), and suturing involved more errors that triggered feedback than dissection did (RR, 1.65 [95% CI, 1.03-3.33]; P = .007). For the utility of the system, different combinations of trainer feedback were associated with different rates of trainee responses. For example, technical feedback with a visual component was associated with an increased rate of trainee behavioral change or verbal acknowledgment responses (RR, 1.11 [95% CI, 1.03-1.20]; P = .02).

Conclusions and Relevance: These findings suggest that identifying different types of triggers, feedback, and responses may be a feasible and reliable method for classifying surgical feedback across several robotic procedures. The outcomes suggest that a system that can be generalized across surgical specialties and for trainees of different experience levels may help galvanize novel surgical education strategies.
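The rate-ratio comparisons above can be sketched in miniature. This is not the study's analysis (the abstract does not specify the model); it assumes a simple Poisson rate ratio with a Wald 95% CI on the log scale, and the counts are hypothetical.

```python
import math

# Minimal sketch of a rate ratio (RR) with a Wald 95% CI, assuming
# Poisson-distributed event counts. All counts below are hypothetical.

def rate_ratio_ci(events1, denom1, events2, denom2, z=1.96):
    """RR = (events1/denom1) / (events2/denom2);
    SE(log RR) = sqrt(1/events1 + 1/events2) under a Poisson assumption."""
    rr = (events1 / denom1) / (events2 / denom2)
    se = math.sqrt(1 / events1 + 1 / events2)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# Hypothetical: 30 safety takeovers across 600 feedback instances with fellows
# vs 10 across 800 with residents.
rr, lo, hi = rate_ratio_ci(30, 600, 10, 800)
print(f"RR={rr:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # RR=4.00
```

A ratio above 1 with a CI excluding 1 indicates the event occurs at a higher rate in the first group, which is how comparisons such as takeovers with fellows versus residents are read.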