Abstract

Semantic image segmentation has a wide range of applications. In medical image segmentation, accuracy matters even more than in other areas because the result provides information directly applicable to disease diagnosis, surgical planning, and history monitoring. The state-of-the-art models in medical image segmentation are variants of the encoder-decoder architecture called U-Net. To effectively reflect the spatial features of the feature maps in an encoder-decoder architecture, we propose a spatially adaptive weighting scheme for medical image segmentation. Specifically, because segmentation results are predicted from the feature map through a convolutional layer, a spatial feature is estimated from the feature maps, and the weighting parameters are learned from this computed map. In the proposed networks, the convolutional block that extracts the feature map is replaced with blocks from widely used convolutional frameworks: VGG, ResNet, and Bottleneck ResNet. In addition, bilinear up-sampling replaces the up-convolutional layer to increase the resolution of the feature map. To evaluate the proposed architecture, we used three data sets covering different medical imaging modalities. Experimental results show that the network combining the proposed self-spatially adaptive weighting block with the ResNet framework gave the highest IoU and DICE scores in all three tasks compared to other methods. Its largest improvements, 3.01% in IoU and 2.89% in DICE, were recorded on the Nerve data set. We therefore believe that the proposed scheme can be a useful tool for image segmentation tasks based on the encoder-decoder architecture.
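The IoU and DICE scores reported above are standard overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they are commonly computed for binary masks, not the evaluation code used in the paper. The function name `iou_and_dice` and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
import numpy as np

def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union > 0 else 1.0
    dice = 2 * intersection / total if total > 0 else 1.0
    return float(iou), float(dice)

# Toy example: 3 predicted pixels, 3 true pixels, 2 of them shared.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")  # IoU=0.500, Dice=0.667
```

For the same pair of masks the Dice score is never lower than the IoU, which is why the two metrics tend to move together in reported results.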

Highlights

  • Over the past few years, deep convolutional neural networks have made a lot of progress in computer vision-based tasks, including image classification [1,2], object detection [3,4], semantic segmentation [5,6], human pose estimation [7,8], image captioning [9,10], and so on

  • Considering the goal of segmentation, which assigns a category label to each pixel in the image, the segmentation result is obtained from the last feature map via the convolutional layer, so the feature maps in the encoder-decoder architecture should reflect the spatial characteristics of the task

  • We propose a spatially adaptive weighting method for the encoder-decoder architecture so that its feature maps reflect spatial characteristics

Summary

Introduction

Over the past few years, deep convolutional neural networks have made considerable progress in computer vision tasks, including image classification [1,2], object detection [3,4], semantic segmentation [5,6], human pose estimation [7,8], and image captioning [9,10]. Semantic image segmentation has a wide range of applications in computer vision, robotics, medicine, and computer graphics. In natural images, segmentation is used to parse the scene, and its performance has improved to the point where it can be applied to autonomous driving and robot sensing, to name a few [6,11]. In medical image segmentation, accuracy is even more important than in other areas because the result gives important information for disease diagnosis, surgical planning, and history monitoring [12]. State-of-the-art scene segmentation frameworks for natural images are based on the fully convolutional network (FCN) [13], and the state-of-the-art models for medical image segmentation are variants of the encoder-decoder architecture called U-Net [14,15]. The goal of segmentation is to assign a category label to each pixel in the image, and the segmentation result is obtained from the last feature map via a convolutional layer, so the feature maps in the encoder-decoder architecture should reflect the spatial characteristics of the task.
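To make the last point concrete, the sketch below shows how a per-pixel label map can be read off a final feature map. It assumes the final convolutional layer uses a 1x1 kernel, as in the original U-Net, so the same linear classifier is applied independently at every pixel. The function name `predict_labels`, the array shapes, and the random weights are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def predict_labels(feature_map, weights, bias):
    """Assign a class label to every pixel.

    feature_map: (C, H, W) last feature map of the decoder
    weights:     (K, C) kernel of an assumed 1x1 convolution, K classes
    bias:        (K,) per-class bias
    Returns an (H, W) array of integer class labels.
    """
    # A 1x1 convolution is a linear map over channels at each pixel.
    logits = np.einsum("kc,chw->khw", weights, feature_map)
    logits = logits + bias[:, None, None]
    # The predicted label is the class with the largest score.
    return logits.argmax(axis=0)

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 5))   # 8 channels, 4x5 pixels
weights = rng.normal(size=(2, 8))       # 2 classes (e.g. background, organ)
bias = np.zeros(2)

labels = predict_labels(features, weights, bias)
print(labels.shape)  # (4, 5)
```

Because each pixel's label depends only on the feature vector at that location, any spatial structure the prediction should respect has to be encoded in the feature map itself.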
