Abstract

A major responsibility of radiologists in routine clinical practice is reading follow-up chest radiographs (CXRs) to identify changes in a patient's condition. Detecting meaningful changes in follow-up CXRs is challenging because radiologists must differentiate disease changes from natural or benign variations. Here, we propose a multi-task Siamese convolutional vision transformer (MuSiC-ViT) with an anatomy-matching module (AMM) that mimics the radiologist's cognitive process for differentiating change from no-change relative to baseline. MuSiC-ViT builds on the "CNNs meet vision transformers" (CMT) model, which combines convolutional neural network (CNN) and transformer architectures. It has three major components: a Siamese network architecture, an AMM, and multi-task learning. Because the input is a pair of CXRs, a Siamese network was adopted for the encoder. The AMM is an attention module that focuses on related regions in the CXR pairs. To mimic a radiologist's cognitive process, MuSiC-ViT was trained with multi-task learning comprising normal/abnormal classification, change/no-change classification, and anatomy matching. Among 406K CXRs studied, 88K change and 115K no-change pairs were acquired for the training dataset. The internal validation dataset consisted of 1,620 pairs. To demonstrate the robustness of MuSiC-ViT, we verified the results on two additional validation datasets. MuSiC-ViT achieved an accuracy and an area under the receiver operating characteristic curve (AUC) of 0.728 and 0.797, respectively, on the internal validation dataset, 0.614 and 0.784 on the first external validation dataset, and 0.745 and 0.858 on the second, temporally separated validation dataset. All code is available at https://github.com/chokyungjin/MuSiC-ViT.
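For readers less familiar with the two reported metrics, the following is a minimal sketch of how accuracy and the area under the receiver operating characteristic curve (AUC) could be computed for change/no-change predictions. It is not taken from the MuSiC-ViT repository. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label.

    Labels use 1 for "change" and 0 for "no-change". The 0.5 threshold
    is an assumption; the abstract does not state the operating point.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen change pair receives
    a higher score than a randomly chosen no-change pair (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one pair of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores, invented for illustration only (not study data).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]
toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two values can diverge on the same dataset.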
