Abstract

Background and objectives: The automatic generation of medical image diagnostic reports can assist doctors by reducing their workload and improving the efficiency and accuracy of diagnosis. However, most existing report generation models suffer from weak correlation between generated words and a lack of contextual information during report generation.

Methods: To address these problems, we propose an Attention-Enhanced Relational Memory Network (AERMNet), in which the relational memory module is continuously updated with the words generated at previous time steps to strengthen the correlation between words in the generated medical image report. A double LSTM with an interaction module reduces the loss of contextual information and makes full use of the feature information. As a result, AERMNet can generate more accurate disease information in medical image reports.

Results: Experimental results on four medical datasets, Fetal Heart (FH), Ultrasound, IU X-Ray and MIMIC-CXR, show that our proposed method outperforms some previous models on language generation metrics (CIDEr improving by 2.4% on FH, BLEU-1 by 2.4% on Ultrasound, CIDEr by 16.4% on IU X-Ray, and BLEU-2 by 9.7% on MIMIC-CXR).

Conclusions: This work promotes the development of medical image report generation and expands the prospects of computer-aided diagnosis applications. Our code is released at https://github.com/llttxx/AERMNET.
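
The abstract only sketches the architecture, so the following is a minimal, hypothetical PyTorch rendering of the two described components: a relational memory updated from the previously generated word, and a double-LSTM decoder with an interaction between the two cells. Class names, slot counts, and the gating details are assumptions for illustration, not the authors' implementation; the actual model is in the released code at https://github.com/llttxx/AERMNET.

import torch
import torch.nn as nn

class RelationalMemory(nn.Module):
    """Memory slots updated at every decoding step from the previously generated
    word: attention over [memory; word embedding], then a gated update.
    (Slot count and gating are assumptions, not the paper's exact design.)"""
    def __init__(self, num_slots=3, d_model=512, num_heads=8):
        super().__init__()
        self.num_slots = num_slots
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                 nn.Linear(d_model, d_model))
        self.gate = nn.Linear(d_model * 2, d_model * 2)

    def init_memory(self, batch_size, device):
        d_model = self.mlp[0].in_features
        return torch.zeros(batch_size, self.num_slots, d_model, device=device)

    def forward(self, memory, prev_word_emb):
        # Attend from the memory slots to [memory; previous word] so the updated
        # memory carries information about the words already generated.
        kv = torch.cat([memory, prev_word_emb.unsqueeze(1)], dim=1)
        attended, _ = self.attn(memory, kv, kv)
        candidate = self.mlp(attended + memory)
        # Input/forget gates conditioned on the previous word embedding.
        gates = self.gate(torch.cat(
            [prev_word_emb.unsqueeze(1).expand_as(memory), memory], dim=-1))
        i, f = gates.chunk(2, dim=-1)
        return torch.sigmoid(f) * memory + torch.sigmoid(i) * torch.tanh(candidate)

class DoubleLSTMDecoder(nn.Module):
    """Two LSTM cells whose states interact each step: the second cell consumes
    the first cell's hidden state, and both feed the word predictor, so less
    contextual information is lost."""
    def __init__(self, d_model=512, vocab_size=1000):
        super().__init__()
        self.lstm_a = nn.LSTMCell(d_model * 2, d_model)
        self.lstm_b = nn.LSTMCell(d_model * 2, d_model)
        self.proj = nn.Linear(d_model * 2, vocab_size)

    def forward(self, word_emb, visual_ctx, state_a, state_b):
        h_a, c_a = self.lstm_a(torch.cat([word_emb, visual_ctx], dim=-1), state_a)
        # Interaction: the second LSTM sees the first LSTM's hidden state.
        h_b, c_b = self.lstm_b(torch.cat([h_a, visual_ctx], dim=-1), state_b)
        logits = self.proj(torch.cat([h_a, h_b], dim=-1))
        return logits, (h_a, c_a), (h_b, c_b)

At each decoding step one would update the relational memory with the embedding of the previously generated word, attend over the image features to obtain visual_ctx, and run the double-LSTM step to predict the next word; this is only a sketch of that loop under the assumptions stated above.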
