Abstract

Semantic segmentation is a fundamental task in computer vision that aims to assign a category to every pixel. Currently, most existing methods decode only from the deepest feature map, even though detailed spatial information is inevitably lost during down-sampling. In the decoder, transposed convolution or bilinear interpolation is widely used to restore the encoded feature map to its original size; however, few optimizations are applied during the up-sampling process, which is detrimental to grouping and classification performance. In this work, we propose a dual-pyramid encoder-decoder deep neural network (DPEDNet) to tackle these issues. The first pyramid integrates and encodes multi-resolution features through sequentially stacked merging, and the second pyramid decodes the features through dense atrous convolution with chained up-sampling. Without post-processing or multi-scale testing, the proposed network achieves state-of-the-art performance on two challenging benchmark image datasets covering both ground-view and aerial-view scenes.
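The bilinear up-sampling mentioned above can be sketched as follows. This is a minimal NumPy illustration of how a decoder restores feature-map resolution by interpolation alone (with corner-aligned sampling); it is not the paper's implementation, and the helper name `bilinear_upsample` is ours.

```python
import numpy as np

def bilinear_upsample(x, scale):
    """Up-sample a 2D feature map by bilinear interpolation.

    x: (H, W) feature map; scale: integer up-sampling factor.
    Output coordinates are mapped back to input space so that the
    four corner values are preserved (align-corners style).
    """
    H, W = x.shape
    out_h, out_w = H * scale, W * scale
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # map the output pixel back into input coordinates
            src_i = i * (H - 1) / (out_h - 1) if out_h > 1 else 0.0
            src_j = j * (W - 1) / (out_w - 1) if out_w > 1 else 0.0
            i0, j0 = int(src_i), int(src_j)
            i1, j1 = min(i0 + 1, H - 1), min(j0 + 1, W - 1)
            di, dj = src_i - i0, src_j - j0
            # weighted average of the four surrounding input values
            out[i, j] = (x[i0, j0] * (1 - di) * (1 - dj)
                         + x[i1, j0] * di * (1 - dj)
                         + x[i0, j1] * (1 - di) * dj
                         + x[i1, j1] * di * dj)
    return out
```

Because no parameters are learned here, this step cannot adapt to the data, which is exactly the limitation the abstract points out.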

Highlights

  • Semantic image segmentation is a dense classification task for image understanding, which has many practical applications such as autonomous driving and augmented reality devices

  • Following the common procedure of semantic segmentation, we reported the precision, recall and mean Intersection over Union (IoU)

  • In Figure 2, the visualization results show that our proposed DPEDNet can accurately detect and segment objects at various scales, in complicated scenes, and under very challenging illumination conditions



Introduction

Semantic image segmentation is a dense classification task for image understanding, with many practical applications such as autonomous driving and augmented reality devices. FCN-based architectures (Ronneberger et al., 2015; Badrinarayanan et al., 2017; Treml et al., 2016; Jiang et al., 2019; Jiang et al., 2020) utilize several pooling layers to extract high-level features and restore the extracted feature map to the original resolution through transposed convolution. Atrous convolution (Holschneider et al., 1990) with various dilation rates is applied in parallel to extract multi-scale features. Although this kind of pyramid structure is effective for multi-scale feature extraction and can enhance the ability to classify and group ambiguous objects, it captures contextual information only from the deepest feature map, by attaching a context module after the encoding stage. We hold the view that the contextual information in the early and middle stages can be further exploited to enhance feature extraction.
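The parallel atrous branches described above can be sketched in NumPy. This is a minimal illustration of dilated convolution and of combining branches with different dilation rates (in the spirit of an ASPP-style context module); it is not the paper's network, and the helper names `atrous_conv2d` and `aspp_like` are ours.

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """Valid-mode 2D atrous (dilated) convolution.

    x: (H, W) feature map; kernel: (k, k); rate: dilation rate.
    The effective kernel extent per side is (k - 1) * rate + 1, so a
    larger rate sees a wider context with the same number of weights.
    """
    k = kernel.shape[0]
    eff = (k - 1) * rate + 1           # effective receptive extent
    H, W = x.shape
    out = np.zeros((H - eff + 1, W - eff + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input with gaps of size `rate`
            patch = x[i:i + eff:rate, j:j + eff:rate]
            out[i, j] = np.sum(patch * kernel)
    return out

def aspp_like(x, kernel, rates=(1, 2, 4)):
    """Run parallel atrous branches with different dilation rates,
    crop to the smallest common size, and sum (a simple stand-in for
    the concatenate-and-project fusion used in real networks)."""
    branches = [atrous_conv2d(x, kernel, r) for r in rates]
    h = min(b.shape[0] for b in branches)
    w = min(b.shape[1] for b in branches)
    return sum(b[:h, :w] for b in branches)
```

Each branch shares the same kernel size but covers a different receptive field, which is why such a pyramid helps with objects at multiple scales; note that all branches here read from the same (deepest) input map, illustrating the limitation discussed above.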

