Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study.

Zelin Wu, Wenyi Gan, Zhaowen Xue, Zhengxin Ni, Xiaofei Zheng, Yiyi Zhang

doi:10.2196/52746

Abstract

Large language models (LLMs) such as ChatGPT mark an important step in the development of artificial intelligence and show great potential in medical education because of their powerful language understanding and generative capabilities. The purpose of this study was to quantitatively evaluate and comprehensively analyze ChatGPT's performance on nursing licensure examination questions from the United States and China, namely the National Council Licensure Examination for Registered Nurses (NCLEX-RN) and the National Nursing Licensure Examination (NNLE).

This study aimed to examine how well LLMs respond to NCLEX-RN and NNLE multiple-choice questions (MCQs) in different language inputs, to evaluate whether LLMs can be used as multilingual learning assistance for nursing, and to assess whether they possess a repository of professional knowledge applicable to clinical nursing practice.

First, we compiled 150 NCLEX-RN Practical MCQs, 240 NNLE Theoretical MCQs, and 240 NNLE Practical MCQs. Then, the translation function of ChatGPT 3.5 was used to translate the NCLEX-RN questions from English to Chinese and the NNLE questions from Chinese to English. Finally, the original and translated versions of the MCQs were input into ChatGPT 4.0, ChatGPT 3.5, and Google Bard. The LLMs were compared by accuracy rate, and accuracy was also compared between language inputs.

The accuracy rates of ChatGPT 4.0 for NCLEX-RN practical questions and Chinese-translated NCLEX-RN practical questions were 88.7% (133/150) and 79.3% (119/150), respectively. Although the difference was statistically significant (P=.03), accuracy was generally satisfactory in both languages. ChatGPT 4.0 correctly answered 71.9% (169/235) of NNLE Theoretical MCQs and 69.1% (161/233) of NNLE Practical MCQs. Its accuracy on NNLE Theoretical MCQs and NNLE Practical MCQs translated into English was 71.5% (168/235; P=.92) and 67.8% (158/233; P=.77), respectively, with no statistically significant difference between input languages. With English input, ChatGPT 3.5 (NCLEX-RN P=.003, NNLE Theoretical P<.001, NNLE Practical P=.12) and Google Bard (NCLEX-RN P<.001, NNLE Theoretical P<.001, NNLE Practical P<.001) had lower accuracy rates on nursing-related MCQs than ChatGPT 4.0. For ChatGPT 3.5, accuracy was higher with English input than with Chinese input, and the difference was statistically significant (NCLEX-RN P=.02, NNLE Practical P=.02). Whether the NCLEX-RN and NNLE MCQs were submitted in Chinese or English, ChatGPT 4.0 had the highest number of unique correct responses and the lowest number of unique incorrect responses among the 3 LLMs.

This study, which covered 618 nursing MCQs from the NCLEX-RN and NNLE exams, found that ChatGPT 4.0 outperformed ChatGPT 3.5 and Google Bard in accuracy. It performed well with both English and Chinese inputs, underscoring its potential as a valuable tool in nursing education and clinical decision-making.
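
The accuracy rates and language comparisons reported above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration in Python, not the authors' analysis: the abstract does not name the statistical test, so an uncorrected Pearson chi-square test on a 2x2 table (correct vs incorrect, original vs translated input) is an assumption, and the function names `accuracy` and `chi_square_2x2` are hypothetical. Only the counts come from the abstract. The resulting P values may differ from the reported ones in the last digit depending on the test and rounding actually used.

```python
import math


def accuracy(correct: int, total: int) -> float:
    """Return the accuracy rate as a percentage."""
    return 100.0 * correct / total


def chi_square_2x2(correct_a: int, total_a: int, correct_b: int, total_b: int):
    """Uncorrected Pearson chi-square test for two proportions.

    ASSUMPTION: the abstract does not state which test was used; this is
    one common choice. Returns (chi2, p) with 1 degree of freedom.
    """
    observed = [
        [correct_a, total_a - correct_a],
        [correct_b, total_b - correct_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [
        observed[0][0] + observed[1][0],
        observed[0][1] + observed[1][1],
    ]
    grand_total = total_a + total_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# ChatGPT 4.0 counts from the abstract: (correct, total) for the original
# input followed by (correct, total) for the translated input.
comparisons = {
    "NCLEX-RN Practical": (133, 150, 119, 150),
    "NNLE Theoretical": (169, 235, 168, 235),
    "NNLE Practical": (161, 233, 158, 233),
}

for name, (c_orig, n_orig, c_trans, n_trans) in comparisons.items():
    chi2, p = chi_square_2x2(c_orig, n_orig, c_trans, n_trans)
    print(
        f"{name}: original {accuracy(c_orig, n_orig):.1f}%, "
        f"translated {accuracy(c_trans, n_trans):.1f}%, "
        f"chi2={chi2:.3f}, P={p:.3f}"
    )
```

Under this assumption, only the NCLEX-RN comparison falls below the conventional .05 threshold, which agrees with the pattern described in the abstract.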

Journal: JMIR Medical Education
Publication Date: Oct 3, 2024
License type: CC BY
