Lung disease is one of the leading causes of death worldwide. This emphasizes the need for early diagnosis in order to provide appropriate treatment and save lives. Physicians typically require information about patients’ clinical symptoms, various laboratory and pathology tests, along with chest X-rays to confirm the diagnosis of lung disease. In this study, we present a transformer-based multimodal deep learning approach that incorporates imaging and clinical data for effective lung disease diagnosis on a new multimodal medical dataset. The proposed method employs a cross-attention transformer module to merge features from the heterogeneous modalities. Then unified fused features are used for disease classification. The experiments were performed and evaluated on several classification metrics to illustrate the performance of the proposed approach. The study’s results revealed that the proposed method achieved an accuracy of 95% in terms of accurate classification of tuberculosis and outperformed other traditional fusion methods on multimodal tuberculosis data used in this study.
Read full abstract