Evaluation of Bias in Medical Student Clinical Clerkship Evaluations Using Natural Language Processing.

Sonali Bhanvadia, Michelle Daniel, Lina Lander, Bharanidharan Radha Saseendrakumar, Joy Guo, Sally L. Baxter

doi:10.1097/acm.0000000000004807

Abstract

Purpose: There is increasing interest in understanding potential bias in medical education. Prior studies have demonstrated potential disparities in the words used to describe medical students in their evaluations, such as variations by gender or race (1,2). For example, natural language processing (NLP) has been used to identify narrative differences between medical clerkship evaluations on the basis of gender and under-represented minority status (2). Females were more commonly described using words for personal attributes, whereas males were more likely to be described using words related to professional competencies (1). In this study, we used NLP to evaluate potential bias among third-year clinical clerkship evaluations.

Method: This study was conducted at the University of California San Diego School of Medicine. Data were extracted from medical evaluation and administrative databases for medical students enrolled in third-year clinical clerkship rotations across 2 academic years (2019–2020 and 2020–2021). For each evaluation, we collected demographic information for both the student and the faculty evaluator. This information was used to determine gender and racial concordance (i.e., whether the student and faculty identified in the same group). We extracted the full narrative text of each evaluation, as well as numerical evaluation scores. Narrative text was processed using a Python NLP package, which assigned a sentiment score to each evaluation. Word clouds were generated for various demographic groups to display the most commonly used words. We analyzed the distributions of sentiment scores and clerkship grades to determine any differences by gender or race/ethnicity. We used multinomial logistic regression to model final clerkship grades, using predictors such as numerical evaluation scores, gender and racial concordance between faculty and students, and sentiment scores. Statistical analyses were performed using R. Statistical significance was defined as P < .05.

Results: We analyzed 963 evaluations from 198 students, with 109 (55%) females. Ninety-two (47%) identified as Caucasian, 77 (39%) Asian, 13 (7%) African American, and 16 (9%) other/unspecified or declined to state race. Ten (5%) identified as Hispanic. Females (190/534, 34%) were more likely to receive honors than males (113/420, 26%) (P = .02). Significantly more students received honors or near-honors grades in the 2020–2021 school year than in the 2019–2020 school year (P < .00001). Sentiment scores for evaluations did not vary significantly by student gender, race, or ethnicity (P = .88, .64, and .06, respectively). Word choices were similar across faculty and student demographic groups. Similarly, in the multinomial logistic regression model of final clerkship grades, the narrative evaluation sentiment score was not predictive of an honors grade (odds ratio [OR] 1.23, P = .58). However, the numerical evaluation average (OR 1.45, P < .001) and gender concordance between faculty and student (OR 1.32, P = .049) were significant predictors of receiving honors.

Discussion: There was no clear evidence of bias in medical student evaluations. No differences were found in sentiment scores or in word choices by gender or race/ethnicity. Of note, sentiment scores from narrative evaluations were not significantly associated with final grades, while numerical evaluation scores were. Narrative feedback tended to be positive for all students regardless of final grade, which may not help students understand how to improve.

Significance: The lack of disparities in our study contrasts with prior findings from other institutions. Ongoing efforts include comparative analyses with other institutions to understand which institutional factors (e.g., geographic location) may contribute to bias. NLP enables a systematic approach for investigating bias. The lack of association between word choices, sentiment scores, and final grades points to opportunities to improve feedback processes for students.
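The per-evaluation features described in the Method (gender and racial concordance, plus a narrative sentiment score) can be illustrated with a minimal Python sketch. The abstract does not name the NLP package or its scoring rule, so the tiny word lexicon, the `Evaluation` record, and the function names below are hypothetical stand-ins chosen for illustration, not the study's implementation.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon (an assumption): the abstract does not name the
# Python NLP package or the vocabulary it scores against.
POSITIVE = {"excellent", "outstanding", "thorough", "compassionate", "strong"}
NEGATIVE = {"poor", "late", "disorganized", "weak", "unprepared"}


@dataclass
class Evaluation:
    """One clerkship evaluation; field names are illustrative only."""
    student_gender: str
    faculty_gender: str
    student_race: str
    faculty_race: str
    narrative: str


def concordant(student_group: str, faculty_group: str) -> bool:
    """True when student and faculty identify in the same group."""
    return student_group.strip().lower() == faculty_group.strip().lower()


def sentiment_score(text: str) -> float:
    """(positive - negative) / matched lexicon words, in [-1, 1].

    Returns 0.0 when the text contains no lexicon words.
    """
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def featurize(ev: Evaluation) -> dict:
    """Build the predictors that a downstream grade model could consume."""
    return {
        "gender_concordant": concordant(ev.student_gender, ev.faculty_gender),
        "race_concordant": concordant(ev.student_race, ev.faculty_race),
        "sentiment": sentiment_score(ev.narrative),
    }


example = Evaluation(
    student_gender="F",
    faculty_gender="f",
    student_race="Asian",
    faculty_race="Caucasian",
    narrative="Excellent and thorough, but occasionally late.",
)
features = featurize(example)
```

In a pipeline like the one described, a feature row of this kind would be built for each evaluation and combined with the numerical evaluation average before fitting the grade model; the model fitting itself, which the authors performed in R, is not sketched here.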

Journal: Academic medicine : journal of the Association of American Medical Colleges
Publication Date: Oct 18, 2022
Citations: 1

Similar Papers

Evaluation of bias and gender/racial concordance based on sentiment analysis of narrative evaluations of clinical clerkships using natural language processing
Sonali Bhanvadia ... Sally L Baxter
BMC Medical Education | VOL. 24
15 Mar 2024

New directions in medical student clerkship evaluations
Avery M Whitis ... Colleen Stockdale
Proceedings in Obstetrics and Gynecology | VOL. 9
01 Dec 2019

The effect of racial and gender concordance between physicians and patients on the assessment of hospitalist performance: a pilot study
Damian Crawford ... Flora Kisuule
BMC Health Services Research | VOL. 19
24 Apr 2019

The Influence of Experimenter Gender and Race on Pain Reporting: Does Racial or Gender Concordance Matter?
Carol S Weisse ... Elizabeth A Fisher
Pain Medicine | VOL. 6
01 Jan 2004
