There is a need for reliable and practical interprofessional simulations that measure collaborative practice in the outpatient and community settings where most health care takes place. The authors applied generalizability theory to examine reliability in an ambulatory care scenario using 2 groups of trained observers: standardized patient (SP, actor) raters and raters who received rater training alone (non-SPs). Twenty-one graduate health professions students participated as health care providers in an interprofessional care simulation involving an SP, a caregiver, and clinicians. Six observers in each group received frame-of-reference training and rated aspects of collaborative care using a behavioral observation checklist. The authors examined sources of measurement variance using generalizability theory and extended this technique to statistically compare the rater types and to compute reliability for subsets of raters. Standardized patient ratings were significantly more reliable than those of non-SP raters, even though both groups received extensive rater training. A single SP rater was predicted to generate scores with a reliability of 0.74, whereas a single non-SP rater was predicted to generate scores with a reliability of 0.40. Removing each rater in turn from the full 6-member SP sample reduced reliability similarly regardless of which rater was removed (reliability, 0.86-0.89), whereas removing individual raters from the full 6-member non-SP sample produced more variable reductions in reliability (0.58-0.72). Ongoing experience rating performance within a particular simulation-based assessment may be a valuable rater characteristic and more effective than rater training alone. The extensions of reliability estimation introduced here can also support more insightful reliability research and subsequent improvement of rater training and assessment protocols.
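To illustrate the kind of generalizability-theory calculation the abstract describes, the sketch below shows a minimal one-facet crossed persons-by-raters analysis in Python: variance components are estimated from two-way ANOVA mean squares, the relative G coefficient is projected for different numbers of raters (as in a D-study), and reliability is recomputed after dropping one rater at a time. This is an assumed, generic implementation with synthetic scores, not the authors' analysis code or data, and all function names are illustrative.

```python
# Minimal sketch (synthetic data, not study data) of a one-facet crossed
# persons-x-raters G-study and D-study projection of reliability.
import numpy as np

def variance_components(scores):
    """Estimate person, rater, and residual (person-x-rater + error) variance
    components for a fully crossed design (rows = persons, columns = raters)."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # Mean squares from the two-way ANOVA without replication.
    ms_p = n_r * np.sum((person_means - grand) ** 2) / (n_p - 1)
    ms_r = n_p * np.sum((rater_means - grand) ** 2) / (n_r - 1)
    resid = scores - person_means[:, None] - rater_means[None, :] + grand
    ms_pr = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations for the random-effects model.
    var_pr = ms_pr                           # interaction + error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)   # person (true-score) variance
    var_r = max((ms_r - ms_pr) / n_p, 0.0)   # rater variance
    return var_p, var_r, var_pr

def g_coefficient(var_p, var_pr, n_raters):
    """Relative G coefficient when scores are averaged over n_raters raters:
    sigma^2_person / (sigma^2_person + sigma^2_pr / n_raters)."""
    return var_p / (var_p + var_pr / n_raters)

# Synthetic example: 21 "providers" each rated by 6 raters on a checklist total.
rng = np.random.default_rng(0)
true_score = rng.normal(0.0, 1.2, size=(21, 1))   # person effects
rater_bias = rng.normal(0.0, 0.3, size=(1, 6))    # rater effects
noise = rng.normal(0.0, 0.8, size=(21, 6))        # interaction + error
scores = true_score + rater_bias + noise

var_p, var_r, var_pr = variance_components(scores)
print("single-rater reliability:", round(g_coefficient(var_p, var_pr, 1), 2))
print("6-rater reliability:     ", round(g_coefficient(var_p, var_pr, 6), 2))

# Leave-one-rater-out reliability, analogous to removing each rater in turn.
for j in range(scores.shape[1]):
    sub = np.delete(scores, j, axis=1)
    vp, _, vpr = variance_components(sub)
    print(f"without rater {j + 1}: {g_coefficient(vp, vpr, sub.shape[1]):.2f}")
```

In this design, the D-study projection makes explicit why a single highly consistent rater can approach acceptable reliability while a noisier rater group needs several raters averaged together.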