Abstract
Automatically generating medical image reports is a worthwhile task. For doctors, it can reduce the heavy burden of writing reports; for patients, it can shorten the waiting time for reports; and it can help avoid misdiagnoses and missed diagnoses caused by human factors. However, the task still faces enormous challenges due to visual and textual data bias and the complex relationships among the components of medical reports. To this end, we propose an auxiliary signal guidance and memory-driven (ASGMD) network for generating medical reports automatically. It includes three modules: an Auxiliary Signal Guidance (ASG) module, a Text Sequential Attention Mechanism (TSAM) module, and a Memory Mechanism-Driven Decoding (MMDD) module. Given a medical image of a patient, radiologists usually focus on the abnormal area first, then browse the global information in the image and write a corresponding report. Mirroring this workflow, the ASG module enhances the features of the abnormal areas of medical images by introducing auxiliary signals to alleviate visual data bias. We design a novel TSAM module that explores the consistency of the medical report context and enhances essential medical information in reports to reduce textual data bias. Finally, the MMDD module integrates visual and textual knowledge to achieve dynamic decoding and generate the final report. The experimental results show that the proposed method outperforms state-of-the-art models on various evaluation metrics on two public datasets, IU-Xray and MIMIC-CXR. To make our results reproducible, we have released our code at https://github.com/shangchengLu/ASGMDN.