Abstract

The BLUS (Basic Laparoscopic Urologic Skills) consortium sought to address the construct validity of BLUS tasks and the wider problem of accurate, scalable and affordable skill evaluation by investigating the concordance of 2 novel candidate methods, automated motion metrics and crowdsourcing, with faculty panel scores. A faculty panel of 5 surgeons and anonymous crowdworkers blindly reviewed a randomized sequence of a representative sample of 24 videos (12 pegboard and 12 suturing) extracted from the BLUS validation study (454) using the GOALS (Global Operative Assessment of Laparoscopic Skills) survey tool with appended pass-fail anchors, via the same web-based user interface. Pre-recorded motion metrics (tool path length, jerk cost, etc.) were available for each video. Cronbach's alpha, Pearson's R and ROC with AUC statistics were used to evaluate concordance among the 3 groups (faculty, crowds and motion metrics), both for continuous scores and for pass-fail criteria. Crowdworkers provided 1,840 ratings in approximately 48 hours, 60 times faster than the faculty panel. The inter-rater reliability of mean expert and crowd ratings was good (α=0.826). Crowd-score-derived pass-fail decisions yielded an AUC of 96.9% (95% CI 90.3-100; positive predictive value 100%, negative predictive value 89%). Motion metrics and crowd scores provided similar or nearly identical concordance with faculty panel ratings and pass-fail decisions. The concordance of crowdsourcing with faculty panels, together with the speed of its reviews, is sufficiently high to merit further investigation alongside automated motion metrics. The overall agreement among faculty, motion metrics and crowdworkers provides evidence in support of the construct validity of 2 of the 4 BLUS tasks.
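
For readers unfamiliar with the three concordance statistics named above, the following is a minimal sketch of how each can be computed from per-video scores. It is not the study's analysis code: the function names and the four-video score lists are hypothetical and made up purely for illustration, and the pass threshold of 15 is an assumption, not a value taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha, treating each rater (or rater group) as an 'item'.

    ratings_by_rater: list of equal-length score lists, one list per rater,
    where position i in every list refers to the same video.
    """
    k = len(ratings_by_rater)
    n = len(ratings_by_rater[0])
    sum_item_var = sum(variance(r) for r in ratings_by_rater)
    totals = [sum(r[i] for r in ratings_by_rater) for i in range(n)]
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def pearson_r(x, y):
    """Pearson correlation coefficient between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen 'pass' video receives a
    higher score than a randomly chosen 'fail' video (ties count as 0.5).
    """
    pos = [s for s, passed in zip(scores, labels) if passed]
    neg = [s for s, passed in zip(scores, labels) if not passed]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical mean scores for four videos (illustrative only).
expert_scores = [10, 14, 18, 22]
crowd_scores = [11, 13, 19, 21]

# Assumed pass-fail rule: the expert panel passes a video scoring >= 15.
expert_pass = [s >= 15 for s in expert_scores]

alpha = cronbach_alpha([expert_scores, crowd_scores])
r = pearson_r(expert_scores, crowd_scores)
auc = roc_auc(crowd_scores, expert_pass)
```

In this sketch, `alpha` and `r` measure agreement between the continuous expert and crowd scores, while `auc` measures how well the crowd score separates videos the expert panel passed from those it failed.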
