Abstract

We present an encoder-decoder convolutional neural network architecture that integrates multi-class semantic segmentation and multi-object detection to improve the efficiency and depth of scene parsing for intelligent vehicles. The encoder is designed as a multi-scale structure to improve real-time performance while preserving accuracy. The decoders comprise a semantic segmentation subnetwork and an object detection subnetwork, which share the encoder feature maps to improve computational efficiency. During training, we use FPS (frames per second) and MIoU (mean intersection over union) as the evaluation metrics for semantic segmentation, and mAP (mean average precision) and FPS as the evaluation metrics for object detection. We train the network both separately and jointly to evaluate its performance. Experimental results show that the proposed network can perform multi-class semantic segmentation and multi-object detection simultaneously, with better real-time performance and richer feature information, making it a promising candidate for deployment on real vehicles.
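The abstract names MIoU as the segmentation metric without spelling out how it is computed. As a minimal sketch, assuming the common definition (per-class intersection over union, averaged over classes), it could be computed as follows. The function name `mean_iou`, the flat label-list inputs, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions, not details taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for flat sequences of integer class ids.

    Assumption: a class that appears in neither pred nor target is skipped
    rather than counted as IoU 0 or 1. Other conventions exist.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    # Return 0.0 for empty input instead of dividing by zero.
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Four "pixels", two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

For a real segmentation map, the 2-D label arrays would be flattened before being passed in. Whether absent classes are skipped or scored changes the reported value, so the convention should match the benchmark being compared against.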
