Abstract

With recent advances in semantic segmentation, encoder-decoder architectures such as U-Net have become the most widely used approach to biomedical image segmentation. To improve upon the existing U-Net, we propose a novel architecture called the Multi-Scale Dilated Fusion Network (MSDFNet). We use a pre-trained ResNet50 as the encoder, which has already learned features that the decoder can use to generate the binary mask. In addition, we use skip connections to transfer features directly from the encoder to the decoder, since some of these features would otherwise be lost due to the depth of the network. The main component of the decoder is the Multi-Scale Dilated Fusion block, in which the multi-scale features are fused and dilated convolutions are then applied to the fused result. We trained both U-Net and the proposed architecture on the Kvasir-Instrument dataset, where the proposed architecture achieves a gain of 3.701% in F1 score and 4.376% in Jaccard index. These results demonstrate an improvement over the existing U-Net model.
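
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1 score (equivalent to the Dice coefficient for binary masks) and the Jaccard index are commonly computed from a predicted and a ground-truth mask. It is not the evaluation code used in this work; the function names, the flattened-list mask representation, and the smoothing term `eps` are assumptions made for illustration.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives, and false negatives
    for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    return tp, fp, fn


def f1_score(pred, target, eps=1e-7):
    """F1 = 2*TP / (2*TP + FP + FN); eps (an assumed smoothing term)
    avoids division by zero when both masks are empty."""
    tp, fp, fn = confusion_counts(pred, target)
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)


def jaccard_index(pred, target, eps=1e-7):
    """Jaccard = TP / (TP + FP + FN), i.e. intersection over union."""
    tp, fp, fn = confusion_counts(pred, target)
    return (tp + eps) / (tp + fp + fn + eps)


# Toy example: six pixels, hypothetical values chosen for illustration only.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
f1 = f1_score(pred, target)
jaccard = jaccard_index(pred, target)
```

For a single pair of masks the two metrics are monotonically related by Jaccard = F1 / (2 - F1), so they rank individual predictions identically, although their dataset-level averages can differ.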
