Surgical skill evaluation that relies on subjective scoring of surgical videos can be time-consuming and inconsistent across raters. We demonstrate differentiated opportunities for objective evaluation to improve surgeon training and performance. Subjective evaluation was performed using the Global evaluative assessment of robotic skills (GEARS) from both expert and crowd raters; whereas, objective evaluation used objective performance indicators (OPIs) derived from da Vinci surgical systems. Classifiers were trained for each evaluation method to distinguish between surgical expertise levels. This study includes one clinical task from a case series of robotic-assisted sleeve gastrectomy procedures performed by a single surgeon, and two training tasks performed by novice and expert surgeons, i.e., surgeons with no experience in robotic-assisted surgery (RAS) and those with more than 500 RAS procedures. When comparing expert and novice skill levels, OPI-based classifier showed significantly higher accuracy than GEARS-based classifier on the more complex dissection task (OPI 0.93 ± 0.08 vs. GEARS 0.67 ± 0.18; 95% CI, 0.16-0.37; p = 0.02), but no significant difference was shown on the simpler suturing task. For the single-surgeon case series, both classifiers performed well when differentiating between early and late group cases with smaller group sizes and larger intervals between groups (OPI 0.9 ± 0.08; GEARS 0.87 ± 0.12; 95% CI, 0.02-0.04; p = 0.67). When increasing the group size to include more cases, thereby having smaller intervals between groups, OPIs demonstrated significantly higher accuracy (OPI 0.97 ± 0.06; GEARS 0.76 ± 0.07; 95% CI, 0.12-0.28; p = 0.004) in differentiating between the early/late cases. Objective methods for skill evaluation in RAS outperform subjective methods when (1) differentiating expertise in a technically challenging training task, and (2) identifying more granular differences along early versus late phases of a surgeon learning curve within a clinical task. Objective methods offer an opportunity for more accessible and scalable skill evaluation in RAS.
Read full abstract