Abstract

Abstract. This paper is devoted to the problem of image semantic segmentation for machine vision system of off-road autonomous robotic vehicle. Most modern convolutional neural networks require large computing resources that go beyond the capabilities of many robotic platforms. Therefore, the main drawback of such models is extremely high complexity of the convolutional neural network used, whereas tasks in real applications must be performed on devices with limited resources in real-time. This paper focuses on the practical application of modern lightweight architectures as applied to the task of semantic segmentation on mobile robotic systems. The article discusses backbones based on ResNet18, ResNet34, MobileNetV2, ShuffleNetV2, EfficientNet-B0 and decoders based on U-Net and DeepLabV3 as well as additional components that can increase the accuracy of segmentation and reduce the inference time. In this paper we propose a model using ResNet34 and DeepLabV3 decoding with Squeeze & Excitation blocks that was optimal in terms of inference time and accuracy. We also demonstrate our off-road dataset and simulated dataset for semantic segmentation. Furthermore, we present that using pre-trained weights on simulated dataset achieves to increase 2.7% mIoU on our off-road dataset compared pre-trained weights on the Cityscapes. Moreover, we achieve 75.6% mIoU on the Cityscapes validation set and 85.2% mIoU on our off-road validation set with a speed of 37 FPS for a 1,024×1,024 input on one NVIDIA GeForce RTX 2080 card using NVIDIA TensorRT.

Highlights

  • Reliable and stable semantic model of the surrounding scene, detection of objects and all kinds of obstacles that may appear in the path of an autonomous car is a difficult task for any machine vision system.Object detection is a two-step approach

  • Using the pretrained weights on our simulated dataset, model with backbone of ResNet34 and DeeplabV3 decoding increased 2.7% mIoU compared to the pre-trained weights on the Cityscapes dataset (Table 3)

  • Approaches based on convolution neural networks, have achieved significant success in various computer vision tasks, such as image classification, object detection, and semantic segmentation

Read more

Summary

Introduction

Reliable and stable semantic model of the surrounding scene, detection of objects and all kinds of obstacles that may appear in the path of an autonomous car is a difficult task for any machine vision system. We need to localize the instances of interest in the image, to classify them. Using deep convolutional neural networks, we can build a bounding box for each object in the image. This approach does not convey the exact shape of the object and does not consider the entire context of the image because the bounding boxes are rectangular. Object detection does not provide a complete understanding of the surrounding scene

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.