Abstract

The aim of this study was to evaluate the variability of subjective tutor assessment of performance improvement (Pi) and to compare it with a novel measurement algorithm, the Pi score. The Pi-score algorithm takes the completion time and number of errors from two repetitions (the first and the fifth) of the same training task and compares them with the corresponding task goals to produce an objective score. We collected data during eight courses on the four tasks of the European Association of Urology training in Basic Laparoscopic Urological Skills (E-BLUS). The same tutor instructed on all courses. The collected data were independently analysed by 14 hands-on training experts (proctors) for Pi assessment. Their subjective Pi assessments were compared for inter-rater reliability. The average per-participant subjective scores from all 14 proctors were then compared with the objective Pi-score algorithm results. Cohen's κ statistic was used for the comparison analysis. A total of 50 participants were enrolled. Concordance among the 14 proctors' scores was as follows: Task 1, κ = 0.42 (moderate); Task 2, κ = 0.27 (fair); Task 3, κ = 0.32 (fair); and Task 4, κ = 0.55 (moderate). Concordance between the Pi-score results and the average proctor score per participant was as follows: Task 1, κ = 0.85 (almost perfect); Task 2, κ = 0.46 (moderate); Task 3, κ = 0.92 (almost perfect); and Task 4, κ = 0.65 (substantial). The present study shows that subjective evaluation of Pi is highly variable, even when made by a cohort of experts. Our algorithm provided an objective score that agreed with the average Pi assessment of a cohort of experts (moderate to almost perfect concordance, depending on the task), using data from a small number of training attempts.
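For readers unfamiliar with the κ statistic reported above, the following is a minimal sketch of Cohen's κ for two raters, together with the conventional Landis and Koch agreement bands that match the labels used in the abstract (fair, moderate, substantial, almost perfect). It is an illustration only: the function names, the binary "improved"/"not improved" coding (1/0) and the example ratings are assumptions, and the abstract does not describe how the study applied κ across 14 proctors or how the Pi score itself is computed.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items.

    Hypothetical helper for illustration; not the study's own code.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over all categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined (0/0), so return 1.0 by convention in this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Map kappa to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up ratings for four participants (1 = improved, 0 = not improved).
proctor_1 = [1, 1, 0, 0]
proctor_2 = [1, 1, 0, 1]
kappa = cohen_kappa(proctor_1, proctor_2)
print(kappa, agreement_label(kappa))  # 0.5 moderate
```

In the made-up example the two raters agree on three of four participants (observed agreement 0.75), while their marginal frequencies give a chance agreement of 0.5, so κ = (0.75 − 0.5) / (1 − 0.5) = 0.5. This shows why κ can be only moderate even when raw agreement looks high.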
