Abstract

Road detection technology plays an essential role in a variety of applications, such as urban planning, map updating, traffic monitoring and automatic vehicle navigation. Recently, there has been much development in detecting roads in high-resolution (HR) satellite images based on semantic segmentation. However, the objects being segmented in such images are of small size, and not all the information in the images is equally important when making a decision. This paper proposes a novel approach to road detection based on semantic segmentation and edge detection. Our approach aims to combine these two techniques to improve road detection, and it produces sharp-pixel segmentation maps, using the segmented masks to generate road edges. In addition, some well-known architectures, such as SegNet, used multi-scale features without refinement; thus, using attention blocks in the encoder to predict fine segmentation masks resulted in finer edges. A combination of weighted cross-entropy loss and the focal Tversky loss as the loss function is also used to deal with the highly imbalanced dataset. We conducted various experiments on two datasets describing real-world datasets covering the three largest regions in Saudi Arabia and Massachusetts. The results demonstrated that the proposed method of encoding HR feature maps effectively predicts sharp segmentation masks to facilitate accurate edge detection, even against a harsh and complicated background.

Highlights

  • Introduction published maps and institutional affilThe rapid advancement of high-resolution (HR) satellite images has increased the need to process and extract valuable data

  • In natural language processing (NLP) tasks, the self-attention mechanism is famous for establishing a word with every other word in the sentence, creating a sense of how important each word is compared to every other word

  • This paper proposes a technique to encode features to predict fine segmentation masks and road edges using the predicted masks

Read more

Summary

Introduction

Introduction published maps and institutional affilThe rapid advancement of high-resolution (HR) satellite images has increased the need to process and extract valuable data. Employing the road segmentation approach is challenging because of the associated background complexity, noise, shadows and occlusions Techniques such as template matching or predefined features are high speed but offer low levels of accuracy. In natural language processing (NLP) tasks, the self-attention mechanism is famous for establishing a word with every other word in the sentence, creating a sense of how important each word is compared to every other word. This technique has been extremely popular in NLP tasks and has recently been used in computer vision and surpassing traditional convolutional neural networks

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call