Abstract

Segmentation and recognition are common steps in identifying objects. This research addresses pixel-wise semantic segmentation of moving objects. The data come from the CamVid video dataset, a collection of autonomous-driving images. The dataset consists of 701 images with accompanying labels. Eleven object classes in the images are segmented and recognized: sky, building, pole, road, pavement, tree, sign-symbol, fence, car, pedestrian, and bicyclist. Segmentation is carried out with SegNet, a Convolutional Neural Network (CNN) method. CNN-based image segmentation generally consists of two parts: an encoder and a decoder. Pre-trained VGG16 and VGG19 networks are used as encoders, while the decoders upsample the encoder outputs. The network is optimized using stochastic gradient descent with momentum (SGDM). In testing, the best-recognized class was road, with an accuracy of 0.96013, IoU of 0.93745, and F1-score of 0.8535 using the VGG19 encoder; with the VGG16 encoder, the accuracy was 0.94162, IoU 0.92309, and F1-score 0.8535.
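
The per-class accuracy and IoU figures reported above can be illustrated with a minimal sketch. It assumes the usual pixel-wise definitions (accuracy as the fraction of a class's ground-truth pixels labeled correctly, IoU as intersection over union of predicted and ground-truth pixels); the abstract does not state the exact formulas or which F1 variant was used, so F1 is omitted. The function name `per_class_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
def per_class_metrics(y_true, y_pred, num_classes):
    """Return per-class (accuracy, IoU) lists from flat sequences of pixel labels."""
    tp = [0] * num_classes  # pixels of class c predicted as c
    fp = [0] * num_classes  # pixels predicted as c that belong to another class
    fn = [0] * num_classes  # pixels of class c predicted as another class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1
            fp[p] += 1

    accuracy, iou = [], []
    for c in range(num_classes):
        gt_pixels = tp[c] + fn[c]       # all ground-truth pixels of class c
        union = tp[c] + fp[c] + fn[c]   # predicted-or-true pixels of class c
        accuracy.append(tp[c] / gt_pixels if gt_pixels else float("nan"))
        iou.append(tp[c] / union if union else float("nan"))
    return accuracy, iou


# Toy example with three classes (assumed labels, not CamVid data).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
acc, iou = per_class_metrics(y_true, y_pred, num_classes=3)
print(acc)
print(iou)
```

In practice the label sequences would be the flattened ground-truth and predicted label maps for all test images, with `num_classes=11` for the classes listed above.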
