Abstract
Medical report generation is a valuable and challenging task that aims to automatically produce accurate and fluent diagnostic reports for medical images, reducing the workload of radiologists and improving the efficiency of disease diagnosis. Fine-grained alignment of medical images and reports facilitates the exploration of close correlations between images and texts, which is crucial for cross-modal generation. However, visual and linguistic biases caused by radiologists' writing styles make cross-modal image-text alignment difficult. To alleviate this visual-linguistic bias, this paper discretizes medical reports and introduces an intermediate modality, i.e., a phrasebook consisting of key noun phrases. As a discretized representation of medical reports, the phrasebook contains both disease-related medical terms and synonymous phrases that represent different writing styles. These synonymous phrases make it possible to identify synonymous sentences, thereby promoting fine-grained alignment between images and reports. Building on the phrasebook, we develop PhraseAug, an augmented two-stage medical report generation model that combines medical images, clinical histories and writing styles to generate diagnostic reports. In the first stage, the phrasebook is used to extract semantically relevant features and to predict the key phrases contained in the report. In the second stage, medical reports are generated from the predicted key phrases; because these include synonymous phrases, the model can adapt to different writing styles and generate diverse medical reports. Experimental results on two public datasets, IU-Xray and MIMIC-CXR, demonstrate that the proposed PhraseAug outperforms state-of-the-art baselines.