Background. Admission decisions require combining information about an applicant using either holistic (human judges) or statistical (actuarial) methods. When the goal is to optimize a defined, measurable outcome, a consistent body of research evidence demonstrates that statistical methods yield better decisions than human judges. It is possible, however, that the benefits of holistic decisions are reflected in unmeasured outcomes. If such benefits exist, they would necessarily appear as systematic variance in the deviations of raters' scores from statistically based decisions.
Purpose. To estimate this variance, we propose a design examining the interrater reliability of difference scores (i.e., the differences between observed committee rankings and rankings based on statistical approaches).
Methods. Example calculations and G study models are presented to demonstrate how rater agreement on difference scores can be analyzed under various circumstances. High interrater reliability of difference scores would support, but not prove, the assertion that the holistic process adds useful information beyond that captured by far less costly statistical approaches. Conversely, interrater reliability of difference scores near zero would demonstrate that committee judgments add random error to the decision process.
Results. The data needed to conduct such studies already exist within most highly selective medical schools and graduate programs, so the proposed validity research could be conducted on existing records.
Conclusions. Such research evidence is critical for establishing the validity of widely used holistic admission approaches.
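A minimal computational sketch of the analysis the Methods section proposes, assuming a fully crossed raters-by-applicants design with no missing data. The function names (`difference_scores`, `interrater_reliability`) and the one-facet G-study variance decomposition are illustrative assumptions, not the authors' specification; other designs (e.g., nested raters) would require a different model.

```python
import numpy as np

def difference_scores(committee_ranks, statistical_ranks):
    """Each rater's ranking minus the statistical (actuarial) ranking.

    committee_ranks: (n_raters, n_applicants) array of per-rater ranks.
    statistical_ranks: (n_applicants,) ranks from the statistical model.
    """
    return committee_ranks - statistical_ranks  # broadcasts over raters

def interrater_reliability(scores):
    """One-facet G study (applicants crossed with raters).

    Decomposes the difference-score variance into applicant, rater,
    and residual components, then returns the generalizability
    coefficient for a single rater:
        rho = var_applicant / (var_applicant + var_residual)
    """
    n_raters, n_apps = scores.shape
    grand = scores.mean()
    app_means = scores.mean(axis=0)    # per-applicant mean over raters
    rater_means = scores.mean(axis=1)  # per-rater mean over applicants

    ss_app = n_raters * np.sum((app_means - grand) ** 2)
    ss_rater = n_apps * np.sum((rater_means - grand) ** 2)
    ss_resid = np.sum((scores - grand) ** 2) - ss_app - ss_rater

    ms_app = ss_app / (n_apps - 1)
    ms_resid = ss_resid / ((n_apps - 1) * (n_raters - 1))

    var_app = max((ms_app - ms_resid) / n_raters, 0.0)
    return var_app / (var_app + ms_resid)

# Hypothetical example: 5 raters ranking 20 applicants. Because the
# committee deviations below are pure noise, the reliability of the
# difference scores should be near zero.
rng = np.random.default_rng(0)
stat_ranks = np.arange(1, 21, dtype=float)
committee = stat_ranks + rng.normal(0, 2, size=(5, 20))
d = difference_scores(committee, stat_ranks)
print(interrater_reliability(d))
```

Under this sketch, systematic (shared) deviations from the statistical ranking inflate the applicant variance component and push the coefficient toward one, while idiosyncratic deviations land in the residual term and push it toward zero, mirroring the two interpretive outcomes described above.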