Abstract
Background: Lumbar spinal stenosis (LSS) is a major cause of chronic lower back and leg pain and is traditionally diagnosed through labor-intensive analysis of magnetic resonance imaging (MRI) scans by radiologists. This study aims to streamline the diagnostic process by developing an automated radiology report generation (ARRG) system based on a vision-language (VL) model. Methods: We used a Generative Image-to-Text (GIT) model, originally designed for visual question answering (VQA) and image captioning. The model was fine-tuned on a modest set of annotated data to generate diagnostic reports directly from lumbar spine MRI scans. Additionally, GPT-4 was used to convert semi-structured text into coherent paragraphs that the GIT model could process more readily. Results: The model generated semantically accurate and grammatically coherent reports. It achieved a METEOR score of 0.37, a BERTScore of 0.886, and a ROUGE-L score of 0.3, indicating its potential to produce clinically relevant content. Conclusions: This study highlights the feasibility of using vision-language models to automate report generation from medical imaging, potentially reducing the diagnostic workload for radiologists.