Abstract

The latest iteration of GPT-4 (generative pre-trained transformer) is a large multimodal model that can integrate both text and image input, but its performance with medical images has not been systematically evaluated. We studied whether ChatGPT with GPT-4V(ision) can recognize images from common nuclear medicine examinations and interpret them. Fifteen representative images (scintigraphy, 11; PET, 4) were submitted to ChatGPT with GPT-4V(ision), both in its Default and its "Advanced Data Analysis (beta)" version. ChatGPT was asked to name the type of examination and the tracer, to describe the findings, and to state whether any abnormalities were present. ChatGPT was also asked to mark anatomical structures or pathological findings. The appropriateness of the responses was rated by 3 nuclear medicine physicians. The Default version identified the examination and the tracer correctly in the majority of the 15 cases (60% and 53%, respectively) and gave an "appropriate" description of the findings or abnormalities in 47% and 33% of cases, respectively. The Default version could not manipulate images. The "Advanced Data Analysis (beta)" version failed in all tasks in >90% of cases. A "major" or "incompatible" inconsistency between 3 trials of the same prompt was observed in 73% of cases (Default version) and 87% of cases ("Advanced Data Analysis (beta)" version). Although GPT-4V(ision) demonstrates preliminary capabilities in analyzing nuclear medicine images, it exhibits significant limitations, particularly in its reliability (ie, correctness, predictability, and consistency).
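The study itself was performed interactively in the ChatGPT interface, but the underlying procedure (submit an image together with a fixed prompt, repeat the same prompt 3 times, and compare the answers for consistency) can also be illustrated programmatically. The sketch below is not part of the study; it assumes the OpenAI Python SDK, and the model name, image file name, and prompt wording are hypothetical placeholders chosen only to show how such repeated image-plus-prompt queries could be scripted.

    # Illustrative only: the study used the ChatGPT web interface, not the API.
    # Model name, image path, and prompt wording are assumptions for this sketch.
    import base64
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask_about_scan(image_path: str, prompt: str, trials: int = 3) -> list[str]:
        """Submit the same image and prompt several times to probe consistency."""
        with open(image_path, "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode("utf-8")
        replies = []
        for _ in range(trials):
            response = client.chat.completions.create(
                model="gpt-4o",  # hypothetical choice; any vision-capable model
                messages=[{
                    "role": "user",
                    "content": [
                        {"type": "text", "text": prompt},
                        {"type": "image_url",
                         "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                    ],
                }],
            )
            replies.append(response.choices[0].message.content)
        return replies

    if __name__ == "__main__":
        answers = ask_about_scan(
            "bone_scintigraphy.png",  # hypothetical file name
            "What type of nuclear medicine examination is this, which tracer was "
            "likely used, and are there any abnormal findings?",
        )
        for i, a in enumerate(answers, 1):
            print(f"--- Trial {i} ---\n{a}\n")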
