Abstract

Medical report generation aims to produce a report for a given radiology image and has attracted increasing research interest. However, existing methods mainly adopt supervised training, which relies on large numbers of medical reports that are unavailable in practice owing to the labor-intensive labeling process and privacy protection protocols. Meanwhile, the intrinsic relationships between local pathological changes in the image are often ignored, although they are important hints for high-quality report generation. To this end, we propose a Relation-Aware Mean Teacher (RAMT) framework, which follows a standard mean teacher paradigm for semi-supervised report generation. The key component of the backbone network's encoder is the Graph-guided Hybrid Feature Encoding (GHFE) module, which exploits a prior disease knowledge graph to encode the intrinsic relations between pathological changes into a graph embedding and learns a word dictionary to retrieve the semantic embedding for each potential pathological change. GHFE combines the graph embedding, semantic embedding, and visual features to form hybrid features, which are sent to a Transformer-based decoder for report generation. Extensive experiments on the MIMIC-CXR and IU X-Ray datasets demonstrate the effectiveness of our proposed approach.
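
For readers unfamiliar with the mean teacher paradigm mentioned above, the following is a minimal sketch of its core step: the teacher's weights are kept as an exponential moving average (EMA) of the student's weights rather than being trained by gradient descent. The function name `ema_update`, the dictionary representation of parameters, and the decay value are illustrative assumptions; the abstract does not specify how RAMT implements or tunes this step.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher parameters moved toward the student's by an EMA step.

    `teacher` and `student` are hypothetical name -> value mappings standing
    in for model weights; `decay` is an assumed value, not one from the paper.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy parameters: after one step the teacher drifts slightly toward the student.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
teacher = ema_update(teacher, student, decay=0.9)
```

In a semi-supervised setup of this kind, the student is typically trained on labeled data plus a consistency term against the teacher's predictions on unlabeled data, and the update above is applied after each student optimization step.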
