Abstract

Image segmentation is an important research topic in image processing and machine vision, and automated driving is one of its main application scenarios. In-vehicle systems face many power-supply and communication constraints, yet the vast majority of current image segmentation algorithms are built on deep learning models. Although these models reach very high segmentation accuracy, they show obvious mesh artifacts and excessive segmentation, and the costly, compute-intensive, power-hungry hardware they require is difficult to deploy in real-world scenarios. This paper therefore focuses on constructing a road scene segmentation model that has a simple structure and does not need large computing power, while maintaining a certain level of accuracy. The ESPNet (Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation) model is introduced in detail, and an improved model is proposed on that basis. First, the network structure of ESPNet is optimized; then the model is further optimized using a small amount of weakly labeled and unlabeled scene sample data. Finally, the new model is applied to segmenting video images from a dash cam. Experiments on Cityscapes, PASCAL VOC 2012, and other datasets verify that the proposed algorithm is faster and requires less than 1% of the parameters of other algorithms, which makes it suitable for mobile terminals.

Highlights

  • In recent years, CNNs (Convolutional Neural Networks) have made great progress in tasks such as image classification and object detection. The most important first step in these tasks is to predict the class of each pixel in an image; by segmenting the original image, researchers hope to identify accurately which part of the image each pixel belongs to

  • Since most of the data in the PASCAL VOC 2012 dataset was not captured by a vehicle-mounted camera, the model proposed in this paper is only moderately effective on it. The data and results of this experiment are shown in Table 2 and Figure 9, respectively. The upper part of the results is the original image, and the lower part is the segmentation result

  • Although parts of the segmentation results are missing compared with those on the Cityscapes dataset, the overall object segmentation is basically correct

Summary

Introduction

CNNs (Convolutional Neural Networks) have made great progress in tasks such as image classification and object detection. The most important first step in these tasks is to predict the class of each pixel in an image; by segmenting the original image, researchers hope to identify accurately which part of the image each pixel belongs to. This step is critical because it comes first in computer vision applications. To address overfitting on small samples, the DenseNet (Densely Connected Convolutional Network)-based FCN model can achieve the required accuracy without prior training and reduces the number of parameters to 1/10 of the original model, which gives it broad application prospects in tasks such as automatic driving, medical imaging, and satellite imaging.
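The per-pixel classification step described above can be illustrated with a minimal sketch. It assumes a segmentation network has already produced one score map per class; the function name `predict_labels`, the toy scores, and the three class indices are hypothetical and are not taken from the paper. Each pixel is simply assigned the class with the highest score.

```python
from typing import List


def predict_labels(scores: List[List[List[float]]]) -> List[List[int]]:
    """Assign each pixel the index of its highest-scoring class.

    scores[c][y][x] is the (hypothetical) network score of class c
    at pixel row y, column x.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Pick the class whose score is largest at this pixel.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        labels.append(row)
    return labels


# Toy 2x2 image with three illustrative classes (assumed, not from the paper).
scores = [
    [[0.90, 0.10], [0.20, 0.30]],  # class 0
    [[0.05, 0.80], [0.10, 0.30]],  # class 1
    [[0.05, 0.10], [0.70, 0.40]],  # class 2
]
label_map = predict_labels(scores)
print(label_map)
```

The resulting label map has the same height and width as the input image, which is what distinguishes segmentation from whole-image classification.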

Preliminaries
Model Evaluation Criteria
Related Works
Improvements Based on the ESPNet Model
Experimental Classification Results and Analysis
Conclusions
