This study was designed to investigate whether artificial intelligence (AI) detection software can determine the use of AI in personal statements for residency applications. Previously written personal statements were collected from physicians who had already matched to residency through the Electronic Residency Application Service. Physicians were recruited for the study through collegial relationships and were given study information via email. The study team constructed five parallel statements from the shared personal statements to prompt AI to create a personal statement of similar content. An online AI detection tool, GPTZero, was used to assess all the personal statements. Statistical analyses were conducted in R using descriptive statistics, t-tests, and Pearson correlations. Eight physicians' statements were compared with eight AI-generated statements. GPTZero correctly identified AI-generated writing, assigning AI-generated statements significantly higher AI probability scores than human-authored essays. Human-authored statements were considered more readable, used shorter words with fewer syllables, and had more sentences than AI-generated essays. Longer average sentence length, low readability scores, and high SAT word percentages were strongly associated with AI-generated essays. This study shows the capacity of GPTZero to distinguish human-authored from AI-generated writing. Use of AI can pose significant ethical challenges and carries a risk of inadvertent harm to certain applicants and erosion of trust in the application process. The authors suggest standardizing protocols regarding the use of AI before its integration into post-graduate medical education.