Abstract

Background
A radiology report communicates the imaging findings to the referring clinicians. The rising number of referrals has created a bottleneck in healthcare: writing a report takes disproportionately more time than the imaging itself. Automatic Radiology Report Generation (ARRG) therefore has great potential to relieve this bottleneck.

Objectives
This study aims to provide a systematic review of Deep Learning (DL) approaches to ARRG. Specifically, it addresses the following research questions: What data have been used to train and evaluate DL approaches to ARRG? How are DL approaches to ARRG evaluated? How is DL used to generate reports from radiology images?

Materials and methods
We followed the PRISMA guidelines. We retrieved 1443 records from PubMed and Web of Science on November 3, 2021. Relevant studies were categorized and compared from multiple perspectives, and the corresponding findings were reported narratively.

Results
A total of 41 studies were included. We identified 14 radiology datasets. In terms of evaluation, we identified four commonly used natural language generation metrics, six clinical efficacy metrics, and other qualitative methods. We compared DL approaches with respect to the underlying neural network architecture, the method of text generation, problem representation, training strategy, interpretability, and intermediate processing.

Discussion and conclusion
Data imbalance (normal versus abnormal cases) and the inner complexity of reports pose major difficulties in ARRG. More appropriate evaluation metrics are required, as are datasets on a much larger scale. Leveraging structured representations of radiology reports and pre-trained language models warrants further research.
