Abstract

The UNet architecture, widely used for biomedical image segmentation, suffers from limitations such as blurred feature maps and over- or under-segmented regions. To overcome these limitations, we propose MACCoM (Multiple Attention and Convolutional Cross-Mixer), an end-to-end depthwise encoder-decoder fully convolutional network built upon deeperUNet and designed for binary and multi-class biomedical image segmentation. We propose a multi-scope attention module (MSAM) that lets the model attend to features at diverse scales, preserving fine details and high-level semantic information, which makes it well suited to the encoder-decoder connections. As depth increases, our proposed spatial multi-head attention (SMA) is added to facilitate inter-layer communication and information exchange, enabling the network to capture long-range dependencies and global context. MACCoM is also equipped with a proposed convolutional cross-mixer that strengthens the model's feature extraction. By incorporating these modules, we combine semantically similar features and reduce artifacts during the early stages of training. Experimental results on four biomedical datasets crafted from three datasets of varying modalities consistently demonstrate that MACCoM outperforms or matches state-of-the-art baselines on the segmentation tasks. On Breast Ultrasound Images (BUSI), MACCoM recorded 99.06 % Jaccard, 77.58 % Dice, and 93.92 % Accuracy, while recording 99.50 % Jaccard, 98.44 % Dice, and 99.29 % Accuracy on the Chest X-ray (CXR) images used. The Jaccard, Dice, and Accuracy on the High-Resolution Fundus (HRF) images are 95.77 %, 74.35 %, and 95.95 %, respectively. These findings highlight MACCoM's effectiveness in improving segmentation performance and its potential value in image analysis.
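
The abstract describes a multi-scope attention module applied at the encoder-decoder (skip) connections. As a rough illustration only, the following sketch shows one way such a gate could re-weight skip features over several receptive fields using parallel dilated depthwise convolutions; the class name, dilation rates, and fusion scheme are assumptions for illustration, not the paper's actual MSAM or MACCoM implementation.

    # Hypothetical sketch of a multi-scope attention gate on a UNet skip connection.
    # All design details here are assumptions; the abstract does not specify them.
    import torch
    import torch.nn as nn

    class MultiScopeAttention(nn.Module):
        """Re-weights skip-connection features using several receptive-field sizes."""

        def __init__(self, channels: int, dilations=(1, 2, 4)):
            super().__init__()
            # One depthwise 3x3 branch per scope (dilation rate); spatial size is preserved.
            self.branches = nn.ModuleList(
                nn.Conv2d(channels, channels, kernel_size=3, padding=d,
                          dilation=d, groups=channels, bias=False)
                for d in dilations
            )
            # Pointwise fusion of the concatenated scopes into a [0, 1] attention map.
            self.fuse = nn.Sequential(
                nn.Conv2d(channels * len(dilations), channels, kernel_size=1),
                nn.BatchNorm2d(channels),
                nn.Sigmoid(),
            )

        def forward(self, skip: torch.Tensor) -> torch.Tensor:
            scopes = torch.cat([branch(skip) for branch in self.branches], dim=1)
            attention = self.fuse(scopes)   # per-pixel, per-channel weights
            return skip * attention         # gated skip features passed to the decoder

    if __name__ == "__main__":
        x = torch.randn(1, 64, 128, 128)    # a skip-connection feature map
        gate = MultiScopeAttention(64)
        print(gate(x).shape)                # torch.Size([1, 64, 128, 128])

In a UNet-style decoder, the gated output would replace the raw skip tensor before concatenation with the upsampled decoder features; the choice of dilated depthwise branches is one plausible reading of "multi-scope" and "depthwise", not a statement of the authors' design.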
