Abstract

Automatic segmentation of lesions, organs, and tissues from medical images is an important part of medical image analysis and is useful for improving the accuracy of disease diagnosis and clinical analysis. For melanoma skin lesions, the contrast between the lesion and the surrounding skin is low, and lesions exhibit irregular shapes, uneven distributions, and complex local and boundary features. Moreover, hair covering the lesions can destroy the local context. Polyp characteristics such as shape, size, and appearance vary at different development stages. Early-stage polyps are small, lack distinctive features, and can easily be mistaken for other intestinal structures such as wrinkles and folds. Imaging positions and illumination conditions can also alter a polyp's appearance and leave no visible transition between the polyp and the surrounding tissue. Accurately segmenting skin lesions and polyps therefore remains challenging due to the high variability in the location, shape, size, color, and texture of the target object, and a robust and accurate segmentation method for medical images is necessary. To address these difficulties, this paper proposes a novel U-shaped encoder-decoder network that combines HarDNet, dual attention (DA), and reverse attention (RA) to enhance segmentation performance in target regions. First, HarDNet68 is employed to extract the backbone features while improving inference speed and computational efficiency. Second, the DA block is adopted to capture global feature dependencies in the spatial and channel dimensions and to enrich the contextual information of local features. Finally, three RA blocks fuse and refine boundary features to produce the final segmentation results. Extensive experiments are conducted on skin lesion datasets (ISIC2016, ISIC2017, and ISIC2018) and on five public polyp datasets (Kvasir, CVC-ClinicDB, CVC-ColonDB, ETIS, and EndoScene). The proposed method outperforms several state-of-the-art segmentation models on the ISIC2018, ISIC2017, and ISIC2016 datasets, with Jaccard indexes of 0.846, 0.881, and 0.894, mean Dice coefficients of 0.907, 0.929, and 0.939, precisions of 0.908, 0.977, and 0.968, and accuracies of 0.953, 0.975, and 0.972. Additionally, the proposed method also performs better than several state-of-the-art segmentation models on the Kvasir, CVC-ClinicDB, CVC-ColonDB, ETIS, and EndoScene datasets, with mean Dice coefficients of 0.907, 0.935, 0.716, 0.667, and 0.887, mean intersection over union coefficients of 0.850, 0.885, 0.644, 0.595, and 0.821, structural similarity measures of 0.918, 0.953, 0.823, 0.807, and 0.933, enhanced alignment measures of 0.952, 0.983, 0.850, 0.817, and 0.957, and mean absolute errors of 0.026, 0.007, 0.037, 0.030, and 0.009. The proposed deep network improves lesion segmentation performance in both polyp and skin lesion images. The quantitative and qualitative results show that the proposed method can effectively handle this challenging segmentation task and has great potential for clinical application.
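The two attention components named in the abstract correspond to well-known designs: the DA block matches the position/channel self-attention of DANet, and the RA blocks match the prediction-reversal weighting used in PraNet-style decoders. Below is a minimal PyTorch sketch of both under those assumptions; all module names, shapes, and reduction ratios are illustrative and not the authors' exact implementation.

```python
# Minimal sketch of the DA and RA components described in the abstract.
# DA follows DANet-style position/channel self-attention; RA follows
# PraNet-style reverse attention. Names and shapes are assumptions.
import torch
import torch.nn as nn


class PositionAttention(nn.Module):
    """Global spatial self-attention: every pixel attends to all others."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # B x HW x C/8
        k = self.key(x).view(b, -1, h * w)                      # B x C/8 x HW
        attn = self.softmax(torch.bmm(q, k))                    # B x HW x HW
        v = self.value(x).view(b, -1, h * w)                    # B x C x HW
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x


class ChannelAttention(nn.Module):
    """Global channel self-attention via a channel-wise Gram matrix."""

    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        feat = x.view(b, c, -1)                                      # B x C x HW
        attn = self.softmax(torch.bmm(feat, feat.permute(0, 2, 1)))  # B x C x C
        out = torch.bmm(attn, feat).view(b, c, h, w)
        return self.gamma * out + x


class DualAttention(nn.Module):
    """Sum of the spatial and channel attention branches, as in DANet."""

    def __init__(self, channels: int):
        super().__init__()
        self.pam = PositionAttention(channels)
        self.cam = ChannelAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pam(x) + self.cam(x)


class ReverseAttention(nn.Module):
    """Weights features by the reversed coarse prediction so the decoder
    focuses on the uncertain boundary region it has not yet explained."""

    def forward(self, feat: torch.Tensor, coarse_logits: torch.Tensor) -> torch.Tensor:
        rev = 1.0 - torch.sigmoid(coarse_logits)   # B x 1 x H x W
        return feat * rev                          # broadcast over channels


if __name__ == "__main__":
    da = DualAttention(channels=64)
    ra = ReverseAttention()
    feats = torch.randn(2, 64, 22, 22)             # e.g. deep backbone features
    coarse = torch.randn(2, 1, 22, 22)             # coarse segmentation logits
    refined = ra(da(feats), coarse)
    print(refined.shape)                           # torch.Size([2, 64, 22, 22])
```

In a full decoder, each of the three RA stages would take a coarse map upsampled from the deeper stage and add its refined output back as a residual prediction; the sketch above only shows the per-stage weighting step.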
