Abstract

Computerized medical image report generation is of great significance for automating the workflow of medical diagnosis and treatment and for reducing health disparities. The task is challenging, however: the generated report must be precise and coherent while conveying heterogeneous information. Current deep-learning-based medical image captioning models rely on recurrent neural networks and extract only top-down visual features, which makes them slow and prone to generating incoherent, hard-to-comprehend reports. To tackle this problem, this paper proposes a hierarchical Transformer-based medical imaging report generation model consisting of two parts: (1) an Image Encoder, which identifies regions of interest via a bottom-up attention module and extracts their top-down visual features; and (2) a non-recurrent, Transformer-based Captioning Decoder, which generates a coherent paragraph of the medical imaging report and improves computational efficiency through parallel computation. The proposed model is trained with a self-critical reinforcement learning method. We evaluate it on the publicly available IU X-ray dataset. The experimental results show that our model improves BLEU-1 by more than 50% compared with other state-of-the-art image captioning methods.
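The abstract includes no code; the following is a minimal PyTorch sketch of the two mechanisms named above: a non-recurrent Transformer decoder attending over bottom-up region features, and a self-critical reinforcement learning loss in the style of Rennie et al. (2017). All names, tensor shapes, and hyperparameters (d_model=512, 36 regions, the scst_loss helper) are illustrative assumptions rather than the authors' implementation.

    import torch
    import torch.nn as nn

    # Non-recurrent Captioning Decoder (sketch). 'memory' stands in for the
    # bottom-up region features produced by the Image Encoder: one d_model-
    # dimensional vector per detected region of interest (shapes assumed).
    d_model, num_regions, seq_len, batch = 512, 36, 40, 8
    memory = torch.randn(num_regions, batch, d_model)  # region features
    tgt = torch.randn(seq_len, batch, d_model)         # shifted report-token embeddings

    decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8)
    decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
    hidden = decoder(tgt, memory)  # (seq_len, batch, d_model), computed in parallel

    # Self-critical loss: the greedy-decoded report serves as the reward
    # baseline, so sampled reports that score higher (e.g. in BLEU) than the
    # greedy one are reinforced and lower-scoring ones are penalized.
    def scst_loss(sample_logprobs, sample_reward, greedy_reward, mask):
        # sample_logprobs: (batch, seq_len) log-probs of the sampled tokens
        # sample_reward, greedy_reward: (batch,) sentence-level scores, e.g. BLEU
        # mask: (batch, seq_len), 1 for real tokens, 0 for padding
        advantage = (sample_reward - greedy_reward).unsqueeze(1)
        loss = -advantage * sample_logprobs * mask
        return loss.sum() / mask.sum()

Because the decoder is non-recurrent, all report tokens in a training batch are processed in one parallel forward pass, which is where the claimed efficiency gain over RNN-based captioners comes from.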
