Background and Objectives: Although computed tomography angiography (CTA) is critical for preoperative planning in autologous breast reconstruction, experienced plastic surgeons may differ in which side of the abdomen they prefer to use for unilateral breast reconstruction. Large language models (LLMs) have the potential to assist with medical imaging interpretation. This study compares the perforator selection preferences of experienced plastic surgeons with those of four popular LLMs, based on CTA images for breast reconstruction. Materials and Methods: Six experienced plastic surgeons from Australia, the US, Italy, Denmark, and Argentina reviewed ten CTA images, indicated their preferred side of the abdomen for unilateral breast reconstruction, and recommended the type of autologous reconstruction. The LLMs were prompted to do the same. The average decisions were calculated, tabulated, and compared. Results: The six consultants predominantly recommended the DIEP procedure (83%). This suggests that experienced surgeons feel more comfortable raising DIEP flaps than TRAM flaps, which they recommended only 3% of the time. They also recommended MS TRAM and SIEA flaps less often (11% and 2%, respectively). Three LLMs (ChatGPT-4o, ChatGPT-4, and Bing CoPilot) recommended DIEP exclusively (100%), while Claude suggested DIEP in 90% of cases and MS TRAM in 10%. Despite minor variations in side recommendations, both consultants and AI models clearly preferred DIEP. Conclusions: Consultants and LLMs consistently preferred DIEP procedures, indicating strong confidence in this flap among experienced surgeons. However, the LLMs occasionally deviated from the consultants' recommendations, which highlights limitations in their image interpretation capabilities. This emphasises the need for ongoing refinement of AI-assisted decision support systems so that they align more closely with expert clinical judgment and become more reliable in clinical practice.