Abstract

Crowdsourcing from the general population is an efficient, inexpensive method of surgical performance evaluation. In this study, we compared the discriminatory ability of experts and crowdsourced evaluators (the Crowd) to detect differences in robotic automated performance metrics (APMs). APMs (instrument motion-tracking and event data recorded directly from the robotic system) for anterior vesico-urethral anastomoses (VUAs) of robotic radical prostatectomies were captured by the dVLogger (Intuitive Surgical). Crowdsourced evaluators and four expert surgeons evaluated video footage using the Global Evaluative Assessment of Robotic Skills (GEARS) tool, scoring individual domains and the total score. Cases were then stratified into performance groups (high versus low quality) for each evaluator based on GEARS scores, and the APMs of each group were compared using the Mann-Whitney U test. Twenty-five VUAs performed by 11 surgeons were evaluated. The Crowd displayed moderate correlation with averaged expert scores across all GEARS domains (r > 0.58, p < 0.01). Bland-Altman analysis showed that the Crowd assigned a narrower distribution of total GEARS scores than the experts. When APMs were compared between performance groups for each evaluator, the metric most commonly differentiated through GEARS scoring was the velocity of the dominant instrument arm. The Crowd outperformed two of the four expert evaluators, discriminating differences in three APMs using total GEARS scores. In summary, the Crowd assigns a narrower range of GEARS scores than experts but maintains overall agreement with them. The discriminatory ability of the Crowd in discerning differences in robotic movements (via APMs) through GEARS scoring is quite refined, rivaling that of expert evaluators.
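
The abstract does not include the underlying analysis code. The following is a minimal, hypothetical sketch of the two comparisons it describes: correlating crowd and averaged expert GEARS scores, and comparing an APM between high- and low-quality groups stratified by an evaluator's GEARS scores using the Mann-Whitney U test. All variable names and values are illustrative assumptions, not study data.

```python
# Hypothetical illustration only; synthetic data stands in for the study's
# GEARS scores and APMs (e.g., dominant instrument arm velocity).
import numpy as np
from scipy.stats import spearmanr, mannwhitneyu

rng = np.random.default_rng(0)
n_cases = 25                                              # number of VUAs
crowd_gears = rng.uniform(15, 25, n_cases)                # crowd total GEARS per case
expert_gears = crowd_gears + rng.normal(0, 2, n_cases)    # averaged expert total GEARS
dominant_arm_velocity = rng.uniform(2, 6, n_cases)        # example APM (arbitrary units)

# (1) Agreement between crowd and averaged expert scoring
rho, p_corr = spearmanr(crowd_gears, expert_gears)
print(f"Crowd vs. expert correlation: r = {rho:.2f}, p = {p_corr:.3f}")

# (2) Stratify cases into high/low performance by the crowd's median GEARS score,
#     then compare the APM between the two groups with the Mann-Whitney U test
cutoff = np.median(crowd_gears)
high = dominant_arm_velocity[crowd_gears >= cutoff]
low = dominant_arm_velocity[crowd_gears < cutoff]
u_stat, p_apm = mannwhitneyu(high, low, alternative="two-sided")
print(f"APM difference (high vs. low group): U = {u_stat:.1f}, p = {p_apm:.3f}")
```

In the study, this stratification and comparison would be repeated for each evaluator (the Crowd and each expert) and for each APM; an evaluator "discriminates" an APM when the between-group difference is statistically significant.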
