The Effect of Resnet Model as Feature Extractor Network to Performance of DeepLabV3 Model for Semantic Satellite Image Segmentation

Yaya Heryadi, Herlawati Herlawati, Edy Irwansyah, Haryono Soeparno, Eka Miranda, Kiyota Hashimoto

doi:10.1109/agers51788.2020.9452768

Abstract

Semantic image segmentation is an interesting problem in Computer Vision with many potential applications. The DeepLab model is combined with two other networks, a Resnet and a Conditional Random Field, which makes it a fairly deep network structure intended to increase semantic segmentation performance. Many previous studies have argued that there are limits on the depth of a deep learning model, because a deep structure may lead to vanishing or exploding gradients, which degrades the model's performance. This paper presents an experimental study that compares the effect of several ImageNet pre-trained Resnet variants with different numbers of layers when used as the feature extractor in the DeepLab model for the semantic image segmentation task. In this study, three models, Resnet34, Resnet50, and Resnet101, were explored as the feature extractor network of DeepLabV3. The experiment found that the (best accuracy, average accuracy) of DeepLabV3-Resnet34, DeepLabV3-Resnet50, and DeepLabV3-Resnet101 are (0.87, 0.86), (0.86, 0.84), and (0.92, 0.88), respectively. Based on the experiment, DeepLabV3-Resnet101 achieved better semantic segmentation performance than the other models.
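The abstract reports a best accuracy and an average accuracy per model but does not define the metric here. As a minimal sketch, assuming accuracy means pixel accuracy (the fraction of pixels whose predicted class matches the ground truth), the two summary values could be computed as follows. The function names and the toy masks are hypothetical illustrations, not the authors' code or data.

```python
from statistics import mean


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the ground-truth class.

    Assumption: both masks are 2D lists of integer class labels.
    """
    same_shape = len(pred) == len(target) and all(
        len(p_row) == len(t_row) for p_row, t_row in zip(pred, target)
    )
    if not same_shape:
        raise ValueError("prediction and target masks must have the same shape")
    total = sum(len(row) for row in target)
    correct = sum(
        p == t
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    return correct / total


def summarize(accuracies):
    """Return (best, average) over a list of per-evaluation accuracies."""
    return max(accuracies), mean(accuracies)


# Hypothetical toy 2x3 masks, not data from the paper.
target = [[0, 1, 1], [2, 2, 0]]
pred_a = [[0, 1, 1], [2, 0, 0]]  # 5 of 6 pixels correct
pred_b = [[0, 1, 0], [2, 0, 0]]  # 4 of 6 pixels correct

accs = [pixel_accuracy(pred_a, target), pixel_accuracy(pred_b, target)]
best, avg = summarize(accs)
print(f"best={best:.2f}, average={avg:.2f}")
```

Under this assumption, each pair in the abstract such as (0.92, 0.88) would correspond to the `(best, avg)` tuple for one backbone.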

Full Text