Abstract

Image segmentation has proven beneficial in many applications, including medical imaging, object detection and scene detection. The future of image segmentation is scene prediction based on previously segmented images. This paper presents a survey comparing image segmentation architectures based on convolutional neural networks (CNNs). It explains the basic layers of a CNN, its depth, its general representation and how it works. It also discusses the challenges faced by these architectures and their probable solutions. The survey highlights the segmentation network (SegNet), an encoder-decoder technique for image segmentation and classification. The SegNet architecture is built using a CNN and the Visual Geometry Group (VGG16) network. VGG16 is a 16-layer deep network that can classify images into 1000 categories. The SegNet architecture addresses the problems of high computation time and high memory requirements. A comparison of the fully convolutional network (FCN) and SegNet is presented at the end. The CamVid dataset is used to test SegNet on scene segmentation. This data set is captured with the sensors shown in SUN RGB-D and implemented using the PASCAL VOC challenges on road scene segmentation and indoor scene segmentation. The paper explores how the SegNet architecture works efficiently on road scenes but faces accuracy issues with indoor scene segmentation.

Keywords

CamVid, CNN, Deconvolution, Encoder-decoder network, FCN, Image segmentation, Indoor/outdoor scenes, Max-pooling, Object detection, PASCAL VOC, Scene recognition, SegNet
