Objective: To investigate medical students' understanding of and trust in ChatGPT-like large language models, as well as their use of and attitudes towards these models.

Methods: Data were collected from December 2023 to mid-January 2024 using a self-designed questionnaire that assessed the use of large language models among undergraduate medical students at Anhui Medical University. Normality of the data was assessed with Shapiro-Wilk tests. We used Chi-square tests for comparisons of categorical variables, Mann-Whitney U tests for comparisons of ordinal variables and non-normal continuous variables between two groups, Kruskal-Wallis H tests for comparisons of ordinal variables between multiple groups, and the Bonferroni correction for post hoc comparisons.

Results: A total of 1774 questionnaires were distributed and 1718 valid questionnaires were collected, for an effective rate of 96.84%. Among these students, 34.5% had heard of and used large language models. There were statistically significant differences in the understanding of large language models by gender (p < 0.001), grade level (junior-level versus senior-level students) (p = 0.03), and major (p < 0.001). Male students, junior-level students, and students majoring in public health management had a higher level of understanding of these models. Gender and major had statistically significant effects on the degree of trust in large language models (p = 0.004 and p = 0.02, respectively). Male students and nursing students exhibited a higher degree of trust in large language models. As for usage, a significantly higher proportion of male students and junior-level students used these models for assisted learning (p < 0.001). About two-thirds of the students (66.7%) held neutral sentiments regarding large language models, and only 51 (3.0%) expressed pessimism. Attitudes towards large language models differed significantly by gender, with male students exhibiting a more optimistic attitude (p < 0.001). Notably, students with different levels of knowledge of and trust in large language models differed significantly in their perceptions of the shortcomings and benefits of these models.

Conclusion: Our study identified gender, grade level, and major as influential factors in students' understanding and utilization of large language models. The findings also suggest that integrating large language models with traditional medical education is feasible and could further enhance teaching effectiveness in the future.
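The arithmetic behind two of the steps above can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the questionnaire counts come from the abstract, but the function names, the group labels `A`, `B`, `C`, and the raw pairwise p-values are hypothetical placeholders, and the Bonferroni step is shown in its simplest form (multiply each raw p-value by the number of comparisons and cap at 1).

```python
from itertools import combinations


def effective_rate(valid: int, distributed: int) -> float:
    """Return valid questionnaires as a percentage of those distributed."""
    return 100.0 * valid / distributed


def bonferroni_adjust(p_values: dict) -> dict:
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Counts reported in the abstract.
rate = effective_rate(valid=1718, distributed=1774)
print(f"Effective rate: {rate:.2f}%")

# Hypothetical groups and raw pairwise p-values, for illustration only.
groups = ["A", "B", "C"]
pairs = list(combinations(groups, 2))  # (A, B), (A, C), (B, C)
raw = dict(zip(pairs, [0.010, 0.030, 0.400]))

adjusted = bonferroni_adjust(raw)
for pair, p in adjusted.items():
    print(pair, f"adjusted p = {p:.3f}")
```

With three pairwise comparisons, a raw p-value has to fall below 0.05 / 3 to remain significant at the 0.05 level after adjustment, which is why post hoc results are usually more conservative than the omnibus test.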