IQAGPT: computed tomography image quality assessment with vision-language and ChatGPT models

Zhihao Chen, Bin Hu, Chuang Niu, Tao Chen, Yuxin Li, Hongming Shan, Ge Wang

doi:10.1186/s42492-024-00171-w

Abstract

Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and have attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision–language correlations from image–text pairs, such as BLIP-2 and GPT-4, have been intensively investigated. Despite these developments, the application of LLMs and VLMs to image quality assessment (IQA), particularly in medical imaging, remains unexplored. Such an application would be valuable for objective performance evaluation and could supplement or even replace radiologists' opinions. To this end, this study introduces IQAGPT, a computed tomography (CT) IQA system that integrates an image-quality captioning VLM with ChatGPT to generate quality scores and textual reports. First, a CT-IQA dataset comprising 1,000 CT slices with diverse quality levels is professionally annotated and compiled for training and evaluation. To better leverage the capabilities of LLMs, the annotated quality scores are converted into semantically rich text descriptions using a prompt template. Second, the image-quality captioning VLM is fine-tuned on the CT-IQA dataset to generate quality descriptions. The captioning model fuses image and text features through cross-modal attention. Third, based on the quality descriptions, users can ask ChatGPT in natural language to rate image quality or to produce radiological quality reports. The results demonstrate the feasibility of assessing image quality using LLMs. The proposed IQAGPT outperformed GPT-4 and CLIP-IQA, as well as multitask classification and regression models that rely solely on images.
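
The abstract does not give the prompt template itself, so the following Python sketch only illustrates the general idea of turning an annotated score into a text description and then composing a rating request for a chat model. The 0 to 4 score scale, the quality words, the template wording, and the function names `score_to_description` and `build_rating_request` are all assumptions made for illustration; they are not taken from the paper, and no model or API is called.

```python
from typing import Dict

# Hypothetical mapping from an integer quality score to a quality word.
# The real CT-IQA annotation scale and vocabulary may differ.
QUALITY_WORDS: Dict[int, str] = {
    0: "very poor",
    1: "poor",
    2: "fair",
    3: "good",
    4: "excellent",
}

# Hypothetical prompt template; the paper's actual template is not shown here.
TEMPLATE = "The quality of this CT image is {quality}."


def score_to_description(score: int) -> str:
    """Convert an annotated quality score into a short text description."""
    if score not in QUALITY_WORDS:
        raise ValueError(f"unsupported quality score: {score}")
    return TEMPLATE.format(quality=QUALITY_WORDS[score])


def build_rating_request(description: str) -> str:
    """Compose the text a user might send to a chat model (no API call is made)."""
    return (
        "Based on the following quality description, rate the image "
        "on a scale from 0 to 4 and reply with the number only.\n"
        f"{description}"
    )


if __name__ == "__main__":
    caption = score_to_description(3)
    print(caption)
    print(build_rating_request(caption))
```

In a pipeline of this kind, descriptions like these would serve as caption targets when fine-tuning the captioning model, and the generated caption, rather than the ground-truth one, would be placed in the request at inference time.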

Journal: Visual Computing for Industry, Biomedicine, and Art
Publication Date: Aug 5, 2024
License type: CC BY 4.0
