Medical imaging is critical for clinical diagnosis. In the past few years, the volume of medical imaging data has grown steadily at an annual rate of 30%, while the number of radiologists has grown by only 4.1% per year. Traditional manual reporting of chest images imposes a high information load and workload, and the correspondence between images and report text is weak. Here we propose a chest CT imaging diagnosis report generation network that effectively integrates image and text feature information, improves the relevance of the cross-modal networks, and automatically generates imaging diagnosis reports. The network comprises three major modules: a novel ResNetII for information extraction, an image–text dual-channel transmodal memory network (DCTMN), and a dual-channel decoder module combined with an LSTM gate. We validate the effectiveness of the proposed network on the IU X-ray and MIMIC-CXR datasets. Our analysis indicates that the proposed method can strengthen the relationship between chest images and text information, ultimately achieving automatic generation of chest imaging diagnosis reports.