Viewpoints

Is It Cheating to Use Chat GPT-4 in the Clinical Practice of Psychiatry?

Steven Hyler, M.D.

Published Online: 8 Jun 2023
https://doi.org/10.1176/appi.pn.2023.07.7.27

As an enthusiastic admirer of technological advancement, I have long been fascinated by new innovations and their potential implications for psychiatry. OpenAI's latest offering, Chat GPT-4, intrigued me because of its exceptional ability to provide a wealth of both personal and professional information. However, the technology occasionally produces fictitious answers, or "hallucinates," in AI parlance, which raises questions about its reliability in patient care.

To assess its applicability in psychiatry, my team at Columbia University Medical Center and I set out to compare Chat GPT-4's responses with a "gold standard" of psychiatric questions and answers readily available online. We found that Chat GPT-4 delivered accurate responses about 70% to 80% of the time, a success rate comparable to that reported in studies using MCAT and USMLE questions. This outcome suggests that Chat GPT-4 holds potential as a valuable asset in psychiatric practice.

Nevertheless, the specter of "cheating" looms large when contemplating the use of AI technology in a professional setting. It is important to distinguish between academic testing, where outside assistance is traditionally prohibited, and real-life medical practice, where physicians may benefit significantly from harnessing all accessible resources for their patients' welfare. In 1983, Robert Spitzer, M.D., proposed the LEAD standard: Longitudinal Evaluation by expert clinicians using All Data available, including input from family members, hospital records, psychological evaluations, and laboratory results. This raises the question: Why insist that our trainees rely solely on memory? Why not empower them with all available tools and teach them to use these resources optimally during their training?

Consider a hypothetical study using exam questions from the PRITE or an ABPN examination. One group is instructed to refrain from using technology while answering the questions, under threat of severe repercussions; the other group is given no such restriction. Who would score higher? Which group would ultimately become better physicians? It might also be worth seeking patients' perspective: Would they prefer that their physicians base assessments, diagnoses, and treatment plans on memory alone, or that they use every technological aid at their disposal to provide better care?

In clinical psychiatry, I contend that using Chat GPT-4 should not be stigmatized as "cheating." Rather, it should be regarded as one more instrument in the ever-growing arsenal of technology available to health care professionals. As physicians, we are responsible for setting boundaries as we integrate AI technology into patient care, and we must remain accountable for that care rather than relying solely on AI-generated responses.

Accordingly, I argue that trainees should be enabled and encouraged to harness all accessible resources, including Chat GPT-4, during their education. This approach could enhance their clinical acumen and their proficiency in integrating technology into patient care. I look forward to further discussion of this subject with colleagues who may hold divergent views. ■

Steven Hyler, M.D., is professor emeritus of psychiatry at Columbia University Medical Center. He has a longstanding interest in education and technology.
