Written comments are gaining traction as robust sources of assessment data. Compared with the structure of numeric scales, what faculty choose to write is ad hoc, leading to idiosyncratic differences in what is recorded. This study offers exploration of what aspects of writing styles are determined by the faculty offering comment and what aspects are determined by the trainee being commented upon. The authors compiled in-training evaluation report comment data, generated from 2012 to 2015 by 4 large North American Internal Medicine training programs. The Linguistic Index and Word Count (LIWC) was used to categorize and quantify the language contained. Generalizability theory was used to determine whether faculty could be reliably discriminated from one another based on writing style. Correlations and ANOVAs were used to determine what styles were related to faculty or trainee demographics. Datasets contained 23-142 faculty who provided 549-2,666 assessments on 161-989 trainees. Faculty could easily be discriminated from one another using a variety of LIWC metrics including word count, words per sentence, and the use of "clout" words. These patterns appeared person specific and did not reflect demographic factors such as gender or rank. These metrics were similarly not consistently associated with trainee factors such as postgraduate year or gender. Faculty seem to have detectable writing styles that are relatively stable across the trainees they assess, which may represent an under-recognized source of construct irrelevance. If written comments are to meaningfully contribute to decision making, we need to understand and account for idiosyncratic writing styles.
Read full abstract