Medical image segmentation and classification are two key steps in computer-aided clinical diagnosis. The region of interest is usually segmented first so that useful features can be extracted for subsequent disease classification. However, such sequential methods are computationally complex and time-consuming. In this paper, we propose a one-stage multi-task attention network (MTANet) that efficiently classifies objects in an image while generating a high-quality segmentation mask for each medical object. A reverse addition attention module is designed for the segmentation task to fuse areas in the global map with boundary cues in high-resolution features, and an attention bottleneck module is used in the classification task to fuse image features with clinical features. We evaluated the performance of MTANet with CNN-based and transformer-based architectures across three imaging modalities for different tasks: the CVC-ClinicDB dataset for polyp segmentation, the ISIC-2018 dataset for skin lesion segmentation, and our private ultrasound dataset for liver tumor segmentation and classification. Our proposed model outperformed state-of-the-art models on all three datasets and outperformed all 25 radiologists in liver tumor diagnosis.