Efficacy of large language models and their potential in Obstetrics and Gynecology education.

Kyung Jin Eoh, Gu Yeun Kwon, Eun Jin Lee, Joonho Lee, Inha Lee, Yong Tae Kim, Eun Ji Nam

doi:10.5468/ogs.24211

Abstract

The performance of large language models (LLMs) and their potential utility in obstetric and gynecological education remain topics of ongoing debate. In light of recent advances in LLM technology, this study aimed to contribute to that discussion by assessing the performance of generative pre-trained transformer (GPT)-3.5 and GPT-4 in understanding clinical information, as well as their potential implications for obstetric and gynecological education. Obstetrics and gynecology residents at three hospitals took an annual promotional examination. Of the 170 questions set over 4 years (2020-2023), 116 were analyzed after excluding 54 questions that contained images. The scores achieved by GPT-3.5, GPT-4, and the 100 residents were compared. The average scores across all 4 years for GPT-3.5 and GPT-4 were 38.79 (standard deviation [SD], 5.65) and 79.31 (SD, 3.67), respectively. For resident groups R1, R2, and R3, the cumulative annual average scores were 79.12 (SD, 9.00), 80.95 (SD, 5.86), and 83.60 (SD, 6.82), respectively. No statistically significant differences were observed between the scores of GPT-4 and those of the residents. On questions specific to obstetrics, the average scores for GPT-3.5 and GPT-4 were 33.44 (SD, 10.18) and 90.22 (SD, 7.68), respectively. GPT-4 demonstrated exceptional performance in obstetrics, different types of data interpretation, and problem solving, showcasing the potential utility of LLMs in these areas. However, it is crucial to acknowledge the constraints of LLMs, and their use should augment human expertise and discernment.

Journal: Obstetrics & gynecology science
Publication Date: Oct 2, 2024
License type: cc-by-nc
