Vocabulary Diversity Research Articles

Accurate medical advice is paramount in ensuring optimal patient care, and misinformation can lead to misguided decisions with potentially detrimental health outcomes. The emergence of large language models (LLMs) such as OpenAI's GPT-4 has spurred interest in their potential health care applications, particularly in automated medical consultation. Yet, rigorous investigations comparing their performance to human experts remain sparse.

This study aims to compare the medical accuracy of GPT-4 with human experts in providing medical advice using real-world user-generated queries, with a specific focus on cardiology. It also aims to analyze the performance of GPT-4 and human experts in specific question categories, including drug or medication information and preliminary diagnoses.

We collected 251 pairs of cardiology-specific questions from general users and answers from human experts via an internet portal. GPT-4 was tasked with generating responses to the same questions. Three independent cardiologists (SL, JHK, and JJC) evaluated the answers provided by both human experts and GPT-4. Using a computer interface, each evaluator compared the pairs and determined which answer was superior. The evaluators also quantitatively measured the clarity and complexity of the questions as well as the accuracy and appropriateness of the responses, applying a 3-tiered grading scale (low, medium, and high). Furthermore, a linguistic analysis was conducted to compare the length and vocabulary diversity of the responses using word count and type-token ratio.

GPT-4 and human experts displayed comparable efficacy in medical accuracy ("GPT-4 is better" at 132/251, 52.6% vs "Human expert is better" at 119/251, 47.4%). In accuracy level categorization, humans had more high-accuracy responses than GPT-4 (50/237, 21.1% vs 30/238, 12.6%) but also a greater proportion of low-accuracy responses (11/237, 4.6% vs 1/238, 0.4%; P=.001). GPT-4 responses were generally longer and used a less diverse vocabulary than those of human experts, potentially enhancing their comprehensibility for general users (sentence count: mean 10.9, SD 4.2 vs mean 5.9, SD 3.7; P<.001; type-token ratio: mean 0.69, SD 0.07 vs mean 0.79, SD 0.09; P<.001). Nevertheless, human experts outperformed GPT-4 in specific question categories, notably those related to drug or medication information and preliminary diagnoses. These findings highlight the limitations of GPT-4 in providing advice based on clinical experience.

GPT-4 has shown promising potential in automated medical consultation, with comparable medical accuracy to human experts. However, challenges remain, particularly in the realm of nuanced clinical judgment. Future improvements in LLMs may require the integration of specific clinical reasoning pathways and regulatory oversight for safe use. Further research is needed to understand the full potential of LLMs across various medical specialties and conditions.
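
The abstract above reports vocabulary diversity as a type-token ratio (distinct words divided by total words). The following is a minimal sketch of how such a measure can be computed, not the study's actual pipeline: the regex tokenizer, the lowercasing step, and the function names `tokenize`, `word_count`, and `type_token_ratio` are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (assumed tokenization rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_count(text):
    """Number of word tokens in the text."""
    return len(tokenize(text))


def type_token_ratio(text):
    """Distinct word types divided by total word tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical example sentence, not taken from the study data.
sample = "The cat saw the dog"
print(word_count(sample), type_token_ratio(sample))
```

Because the type-token ratio tends to fall as texts get longer, a comparison between longer and shorter responses may partly reflect length rather than vocabulary alone.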

National classifications and terminologies already routinely used for documentation within patient care settings enable the unambiguous representation of clinical information. However, the diversity of vocabularies across health care institutions and countries is a barrier to achieving semantic interoperability and exchanging data across sites. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) enables the standardization of structure and medical terminology. It allows the mapping of national vocabularies into so-called standard concepts, representing normative expressions for international analyses and research. Within our project "Hybrid Quality Indicators Using Machine Learning Methods" (Hybrid-QI), we aim to harmonize source codes used in German claims data vocabularies that are currently unavailable in the OMOP CDM.

This study aims to increase the coverage of German vocabularies in the OMOP CDM. We aim to completely transform the source codes used in German claims data into the OMOP CDM without data loss and make German claims data usable for OMOP CDM-based research.

To prepare the missing German vocabularies for the OMOP CDM, we defined a vocabulary preparation approach consisting of the identification of all codes of the corresponding vocabularies, their assembly into machine-readable tables, and the translation of German designations into English. Furthermore, we used 2 proposed approaches for OMOP-compliant vocabulary preparation: the mapping to standard concepts using the Observational Health Data Sciences and Informatics (OHDSI) tool Usagi and the preparation of new 2-billion concepts (ie, concept_id >2 billion). Finally, we evaluated the prepared vocabularies regarding completeness and correctness using synthetic German claims data and calculated the coverage of German claims data vocabularies in the OMOP CDM.

Our vocabulary preparation approach was able to map 3 missing German vocabularies to standard concepts and prepare 8 vocabularies as new 2-billion concepts. The completeness evaluation showed that the prepared vocabularies cover 44.3% (3288/7417) of the source codes contained in German claims data. The correctness evaluation revealed that the specified validity periods in the OMOP CDM are compliant for the majority (705,531/706,032, 99.9%) of source codes and associated dates in German claims data. The calculation of the vocabulary coverage showed a noticeable decrease of missing vocabularies from 55% (11/20) to 10% (2/20) due to our preparation approach.

By preparing 10 vocabularies, we showed that our approach is applicable to any type of vocabulary used in a source data set. The prepared vocabularies are currently limited to German vocabularies, which can only be used in national OMOP CDM research projects, because the mapping of new 2-billion concepts to standard concepts is missing. To participate in international OHDSI network studies with German claims data, future work is required to map the prepared 2-billion concepts to standard concepts.
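
The preparation and evaluation steps described above (assigning local concept identifiers above 2 billion, measuring how many distinct source codes are covered, and checking dates against validity periods) can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the function names, the sorted-order ID assignment, and the sample codes are hypothetical, and only the `concept_id` threshold and the notion of a validity period come from the abstract.

```python
from datetime import date

# Local concepts use concept_id values above 2 billion, as stated in the abstract.
TWO_BILLION = 2_000_000_000


def assign_local_concept_ids(source_codes, start=TWO_BILLION + 1):
    """Map each distinct source code to a local concept_id (assumed scheme)."""
    unique_codes = sorted(set(source_codes))
    return {code: start + i for i, code in enumerate(unique_codes)}


def coverage(source_codes, prepared_codes):
    """Share of distinct source codes that appear in the prepared vocabularies."""
    distinct = set(source_codes)
    if not distinct:
        return 0.0
    return len(distinct & set(prepared_codes)) / len(distinct)


def is_valid_on(event_date, valid_start_date, valid_end_date):
    """True if the event date falls inside the concept's validity period."""
    return valid_start_date <= event_date <= valid_end_date


# Hypothetical codes for illustration only, not real German claims codes.
claims_codes = ["A01", "A01", "B02", "C03", "D04"]
prepared = {"A01", "C03"}

print(assign_local_concept_ids(["B02", "A01", "A01"]))
print(coverage(claims_codes, prepared))
print(is_valid_on(date(2020, 6, 1), date(2019, 1, 1), date(2020, 12, 31)))
```

Coverage here is computed over distinct codes, which is one plausible reading of the completeness figure; counting every occurrence of a code instead would give a different number.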

Articles published on Vocabulary Diversity

Names of Russian Oven and Its Components in Arkhangelsk Dialects

Narrative reconstruction in deaf and hearing children: A comparative study in the context of Arabic diglossia

Estimates of speech efficiency in monolingual and bilingual speakers of English.

Assessing GPT-4's Performance in Delivering Medical Advice: Comparative Analysis With Human Experts.

Should We Stop Using Lexical Diversity Measures in Children's Language Sample Analysis?

Speech-to-text intervention to support text production among students with writing difficulties: a single-case study in Nordic countries

The Use of Natural Language Processing Elements for Computer-Aided Diagnostics and Monitoring of Body Image Perception in Enterally Fed Patients with Head and Neck or Upper Gastrointestinal Tract Cancers.

Cigarette tasting Chinese text classification for low-resource scenarios

Beyond content: discriminatory power of function words in text type classification

Expanding the Spatial Reach and Human Impacts of Critical Zone Science

Variasi dan Komponen Makna Verba Pewarta pada Korpus Berita Daring [Variation and Meaning Components of Reporting Verbs in an Online News Corpus]

Examining gender effects in autistic written language skills: A small sample exploratory study.

Evaluating the Impact of Text Data Augmentation on Text Classification Tasks using DistilBERT

Lexical diversity as a predictor of complexity in textbooks on the Russian language

The Use of Natural Language Processing for Computer-Aided Diagnostics and Monitoring of Body Image Perception in Patients with Cancers.

Assessing the Use of German Claims Data Vocabularies for Research in the Observational Medical Outcomes Partnership Common Data Model: Development and Evaluation Study.

Parent–toddler play talk: Toddler speech is differentially associated with paternal and maternal speech in interaction

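북한 공장대학교 영어교과서 읽기지문에 대한 코퍼스 기반 분석 [A Corpus-Based Analysis of Reading Passages in English Textbooks of North Korean Factory Colleges]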

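언어반응촉진 전략을 사용한 보완대체의사소통 중재가 자폐성장애 중학생의 구문능력과 어휘다양도에 미치는 효과 [Effects of an Augmentative and Alternative Communication Intervention Using Language Response Facilitation Strategies on the Syntactic Ability and Vocabulary Diversity of Middle School Students with Autism Spectrum Disorder]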

Vocabulary Diversity in Personal Narratives Produced in Response to the Global TALES Protocol in Dutch-Speaking Students with and without Dyslexia
