Abstract

Background: Growing application volumes make it difficult for residency programs to review applicants holistically. Pilot programs aimed at appraising applicants rapidly are underway to improve holistic review. Even so, assessing the narrative components of residency applications remains a bottleneck, and many programs resort to filtering invitations on academic metrics and attributes that are known to carry racial, gender, and other inequities. To address this gap and automate the process, the authors used natural language processing (NLP) to predict an interview invitation based solely on applicants’ narrative experiences.

Methods: Narrative experience entries were extracted from 6500 residency applications across three application cycles (2017-2019) at one academic internal medicine program and paired with the interview invitation decision made by program directors. NLP was used to process the experience entries; the importance of each word or phrase (called a “token”) was calculated using term frequency-inverse document frequency (TF-IDF). A logistic regression model was developed to identify words or phrases that predicted an interview invitation. Models were also built using structured application data such as USMLE scores and graduation year, either alone or combined with the NLP analysis. Model performance was evaluated on previously unseen data using the areas under the receiver operating characteristic curve and the precision-recall curve (AUROC and AUPRC, respectively). The words or phrases with the greatest influence on interview invitation were examined.

Results: The NLP-only model had an AUROC of 0.79 and an AUPRC of 0.48; phrases such as “student-run”, “pre-clinical clerkships”, “analyze data”, and “social determinants of health” increased the probability of receiving an interview invitation. Adding structured data significantly improved prediction, with an AUROC of 0.93 and an AUPRC of 0.74. In this combined model, the variable representing the NLP-predicted probability from the narrative experiences had the largest coefficient magnitude, underscoring the importance of experiences in the resident selection process.

Conclusion: To our knowledge, this is the first report on applying NLP to residency application materials to predict an interview invitation. NLP may offer a promising approach to rapid holistic review and could advance the aims of equity and diversity in selection by expanding the metrics currently used.
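As a rough illustration of the kind of pipeline the Methods describe (tokens weighted by TF-IDF, a logistic regression on those weights, and AUROC on unseen entries), the following minimal sketch uses only the Python standard library. It is not the authors' implementation: the four training entries and two held-out entries are invented toy text, the smoothed IDF formula, unigram-plus-bigram tokens, and gradient-descent settings are assumptions, and all function names are hypothetical. AUPRC is omitted for brevity.

```python
import math
from collections import Counter

def tokenize(text, max_n=2):
    """Lowercase, split on whitespace, and emit unigrams and bigrams."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]

def fit_idf(tokenized_docs):
    """Smoothed inverse document frequency learned from training entries only."""
    n_docs = len(tokenized_docs)
    df = Counter()
    for toks in tokenized_docs:
        df.update(set(toks))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(tokens, idf):
    """Sparse {token: tf * idf} vector; tokens unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Predicted probability of an invitation for one sparse vector."""
    return sigmoid(b + sum(w.get(t, 0.0) * v for t, v in x.items()))

def fit_logistic(vectors, labels, epochs=500, lr=0.5, l2=0.01):
    """L2-regularised logistic regression fitted by stochastic gradient descent."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            err = predict_proba(x, w, b) - y
            b -= lr * err
            for t, v in x.items():
                w[t] = w.get(t, 0.0) - lr * (err * v + l2 * w.get(t, 0.0))
    return w, b

def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy entries (1 = invited, 0 = not invited); not data from the study.
train_texts = ["student-run free clinic volunteer",
               "analyze data for research project",
               "member of hiking club",
               "played intramural soccer"]
train_labels = [1, 1, 0, 0]
test_texts = ["student-run clinic volunteer", "hiking club member"]
test_labels = [1, 0]

train_tokens = [tokenize(t) for t in train_texts]
idf = fit_idf(train_tokens)
weights, bias = fit_logistic([tfidf_vector(t, idf) for t in train_tokens], train_labels)

test_scores = [predict_proba(tfidf_vector(tokenize(t), idf), weights, bias)
               for t in test_texts]
held_out_auroc = auroc(test_scores, test_labels)

# Tokens with the largest positive coefficients push predictions toward an invitation.
top_tokens = sorted(weights, key=weights.get, reverse=True)[:5]
print(held_out_auroc, top_tokens)
```

Inspecting the sorted coefficients is one simple way to surface the words or phrases that most raise or lower the predicted probability. In practice a library such as scikit-learn would typically replace the hand-written pieces here.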
