US Radiology Research Articles

Background: The impression section integrates the key findings of a radiology report but can be subjective and variable. We sought to fine-tune and evaluate an open-source large language model (LLM) for automatically generating impressions from the remainder of a radiology report across different imaging modalities and hospitals.

Methods: In this institutional review board-approved retrospective study, we collated a dataset of CT, US, and MRI radiology reports from the University of California San Francisco Medical Center (UCSFMC) (n = 372,716) and the Zuckerberg San Francisco General (ZSFG) Hospital and Trauma Center (n = 60,049), both under a single institution. The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score, a metric that measures word overlap, was used for automatic natural language evaluation. A reader study with five cardiothoracic radiologists was performed to evaluate the model’s performance more strictly on a specific modality (CT chest exams) against a subspecialist radiologist baseline. We stratified the results of the reader study by diagnosis category and original impression length to gauge case complexity.

Results: The LLM achieved ROUGE-L scores of 46.51, 44.2, and 50.96 on UCSFMC and, upon external validation, ROUGE-L scores of 40.74, 37.89, and 24.61 on ZSFG across the CT, US, and MRI modalities, respectively. This implies substantial overlap between the model-generated impressions and the impressions written by the subspecialist attending radiologists, with some degradation upon external validation. In our reader study, the model-generated impressions achieved overall mean scores of 3.56/4 for clinical accuracy, 3.92/4 for grammatical accuracy, 3.37/4 for stylistic quality, 18.29 s for edit time, 12.32 words for edit distance, and 84 for ROUGE-L. The original impressions written by a subspecialist radiologist achieved overall mean scores of 3.75/4, 3.87/4, 3.54/4, 12.2 s, 5.74 words, and 89 on the same measures, respectively. The LLM achieved its highest clinical accuracy ratings for acute/emergent findings and on shorter impressions.

Conclusions: An open-source fine-tuned LLM can generate impressions to a satisfactory level of clinical accuracy, grammatical accuracy, and stylistic quality. Our reader performance study demonstrates the potential of large language models in drafting radiology report impressions that can aid in streamlining radiologists’ workflows.
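For readers unfamiliar with the ROUGE-L metric reported in this abstract, the following is a minimal Python sketch of the usual longest-common-subsequence (LCS) formulation, given as an F-measure on a 0 to 1 scale. It is an illustration only, not the authors' evaluation code: the whitespace tokenization, the lowercasing, the function names, and the example impressions are all assumptions made here, and the abstract's scores appear to be on a 0 to 100 scale.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L F-measure between two strings (0.0 to 1.0).

    Assumption: simple lowercase whitespace tokenization; published
    implementations may tokenize or stem differently.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical impressions, invented for illustration only.
original = "no acute cardiopulmonary abnormality"
generated = "no acute abnormality"
print(round(100 * rouge_l(original, generated), 2))
```

In this hypothetical pair the LCS is three tokens, so precision is 3/3, recall is 3/4, and the F-measure is 6/7. The example also shows why a high overlap score does not by itself establish clinical accuracy, which is the gap the reader study addresses.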

This study is the first multicenter non-inferiority study that aims to critically evaluate the effectiveness of HHUS/ABUS for breast cancer detection in China. This was a multicenter hospital-based study in which five hospitals participated. Women (30–69 years old) who met defined criteria were invited for breast examination by HHUS, ABUS, and/or mammography (MG). For BI-RADS category 3, an additional magnetic resonance imaging (MRI) test was provided to distinguish true negative results from false negative results. For women classified as BI-RADS category 4 or 5, either core aspiration biopsy or surgical biopsy was done to confirm the diagnosis. Between February 2016 and March 2017, 2844 women signed the informed consent form, and 1947 of them were included in the final analysis (680 were 30 to 39 years old, 1267 were 40 to 69 years old). For all participants, ABUS sensitivity (91.81%) was compared with HHUS sensitivity (94.70%) using a non-inferiority Z test (P = 0.015). In the 40–69 age group, non-inferiority Z tests showed that ABUS sensitivity (93.01%) was non-inferior to MG sensitivity (86.02%) with P < 0.001 and that HHUS sensitivity (95.44%) was non-inferior to MG sensitivity (86.02%) with P < 0.001. The sensitivities of ABUS and HHUS were both superior to that of MG (P < 0.001, superiority test). For all participants, ABUS specificity (92.89%) was non-inferior to HHUS specificity (89.36%) with P < 0.001, and a superiority test showed that the specificity of ABUS was superior to that of HHUS with P < 0.001. In the 40–69 age group, ABUS specificity (92.86%) was non-inferior to MG specificity (91.68%) with P < 0.001, and HHUS specificity (89.55%) was non-inferior to MG specificity (91.68%) with P < 0.001. ABUS specificity was not superior to MG specificity (P = 0.114, superiority test). The sensitivity of ABUS/HHUS is superior to that of MG, and the specificity of ABUS/HHUS is non-inferior to that of MG. In China, for an experienced US radiologist, both HHUS and ABUS have better diagnostic efficacy than MG in symptomatic individuals.
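As a rough illustration of what a non-inferiority Z test on two proportions involves, here is a minimal Python sketch of the common one-sided Wald form. It is not the study's analysis: the abstract does not report the non-inferiority margin or the underlying case counts, so the margin and counts below are hypothetical, the function name is invented here, and the sketch treats the two groups as independent, whereas modalities applied to the same women may call for a paired method.

```python
import math


def noninferiority_z(x_new, n_new, x_ref, n_ref, margin):
    """One-sided Wald Z test for non-inferiority of two proportions.

    H0: p_new - p_ref <= -margin   (new test is inferior)
    H1: p_new - p_ref >  -margin   (new test is non-inferior)
    Assumes independent samples and an unpooled standard error.
    """
    p_new = x_new / n_new
    p_ref = x_ref / n_ref
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    if se == 0:
        raise ValueError("standard error is zero; test is undefined")
    z = (p_new - p_ref + margin) / se
    # Upper-tail probability of the standard normal: 1 - Phi(z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts and margin, NOT taken from the study.
z, p = noninferiority_z(x_new=92, n_new=100, x_ref=95, n_ref=100, margin=0.10)
print(f"z = {z:.3f}, one-sided p = {p:.4f}")
```

A small p-value here rejects the hypothesis that the new test is worse than the reference by more than the margin. A superiority test is the same calculation with the margin set to zero.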

Articles published on US Radiology

An open-source fine-tuned large language model for radiological impression generation: a multi-reader performance study

Status of LGBTQ+ Inclusion: Multi-Institution Assessment of US Radiology Residencies

Diversity, Equity, Inclusion in US Radiology: Current Status and Legislative Trends

Discrimination faced by radiology residents: an analysis of experiences and mitigation strategies

US Radiology Resident Perceptions of Current Well-Being Programming: A Case Study

The Influence of Extracurricular Activities on Radiology Resident Selection Decisions

Venture Capital in US Medicine: A Briefing for Radiologists

The Effect of the COVID-19 Pandemic on Academic Research Gender Disparities in Radiology

Family and Medical Leave Utilization in US Radiology Practices

Sociodemographic Variables Reporting in Human Radiology Artificial Intelligence Research

Leadership: Causing and Curing Burnout in Radiology

Artificial Intelligence/Machine Learning Education in Radiology: Multi-institutional Survey of Radiology Residents in the United States

Assessment of US Radiology Residency Program Websites in the COVID-19 Era

Radiology Practices Employing Nurse Practitioners and Physician Assistants: Characteristics and Trends From 2017 Through 2019

The Radiology Resident Education Research Alliance: The Evolution of a Multi-Institutional Research Cooperative

Geographic Trends in Publications and Submissions in Radiology Journals: Decade Report (2010 – 2020)

Chest-Related Imaging Investigations During Multiple Waves of COVID-19 Infection in Hong Kong

Factors Influential in the Selection of Radiology Residents in the Post-Step 1 World: A Discrete Choice Experiment

A multicenter, hospital-based and non-inferiority study for diagnostic efficacy of automated whole breast ultrasound for breast cancer in China

Resident Clinician Educator Leadership Pathway Tracks in US Radiology Programs: An ADVICER 2021 Survey Study
