Abstract

Producing a better segmentation mask is crucial in scene understanding. Semantic segmentation is a vital task for applications such as autonomous driving, robotics, and medical image understanding. Efficient manipulation of high- and low-level context is key to competent pixel-level classification. An image's high-level feature map helps establish the spatial configuration of objects in the segmentation, while its low-level features help discern the boundaries of objects in the segmentation map. In our implementation, we use a two-bridge network. The first bridge manipulates the subtle differences between images and produces a vector that captures the low-level features of the input images. The second bridge produces a global contextual aggregation of the image, gathering a better understanding of its high-level features. The backbone is a dilated residual network, which avoids attrition of the image size during feature extraction. We train our network on the Cityscapes and ADE20K datasets and compare our results with state-of-the-art models. Initial experiments yielded a mean IoU of 70.1% and a pixel accuracy of 94.4% on the Cityscapes dataset, and 34.6% on the ADE20K dataset.
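The abstract does not give implementation details, so the following is only a minimal PyTorch sketch of how such a two-bridge design over a dilated residual backbone could be wired together. The module names (LowLevelBridge, ContextBridge, TwoBridgeSegNet), the ASPP-style dilated branches in the context bridge, and the sigmoid-gated fusion of the low-level vector are illustrative assumptions, not the authors' released architecture.

```python
# Hypothetical sketch of a two-bridge segmentation network; names and fusion
# strategy are assumptions made for illustration, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50


class LowLevelBridge(nn.Module):
    """Compresses an early feature map into a vector summarising low-level cues."""
    def __init__(self, in_channels, dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(in_channels, dim, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),          # global pooling -> one vector per image
        )

    def forward(self, x):
        return self.proj(x)                   # shape: (N, dim, 1, 1)


class ContextBridge(nn.Module):
    """Aggregates global context from the high-level feature map (ASPP-style assumption)."""
    def __init__(self, in_channels, dim=256, rates=(1, 6, 12)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, dim, kernel_size=3, padding=r, dilation=r)
            for r in rates
        ])
        self.fuse = nn.Conv2d(dim * len(rates), dim, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))


class TwoBridgeSegNet(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        backbone = resnet50(weights=None)
        # Dilate the last two stages so the feature map is not downsampled further,
        # mimicking a dilated residual network backbone.
        for stage, rate in ((backbone.layer3, 2), (backbone.layer4, 4)):
            for m in stage.modules():
                if isinstance(m, nn.Conv2d) and m.stride == (2, 2):
                    m.stride = (1, 1)
                if isinstance(m, nn.Conv2d) and m.kernel_size == (3, 3):
                    m.dilation, m.padding = (rate, rate), (rate, rate)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                                  backbone.maxpool, backbone.layer1)
        self.body = nn.Sequential(backbone.layer2, backbone.layer3, backbone.layer4)
        self.low = LowLevelBridge(256)        # layer1 output channels in ResNet-50
        self.context = ContextBridge(2048)    # layer4 output channels in ResNet-50
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, x):
        low_feat = self.stem(x)
        high_feat = self.body(low_feat)
        context = self.context(high_feat)
        # Modulate global context with the low-level vector, then classify and upsample.
        context = context * torch.sigmoid(self.low(low_feat))
        logits = self.classifier(context)
        return F.interpolate(logits, size=x.shape[-2:], mode="bilinear",
                             align_corners=False)


if __name__ == "__main__":
    model = TwoBridgeSegNet(num_classes=19)   # 19 evaluation classes for Cityscapes
    out = model(torch.randn(1, 3, 256, 512))
    print(out.shape)                          # torch.Size([1, 19, 256, 512])
```

In this sketch, converting the last two ResNet stages to dilated convolutions keeps the high-level features at 1/8 of the input resolution, which is what lets the backbone extract deep features without the attrition of spatial size mentioned in the abstract.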
