Abstract

Most approaches obtain more training data through augmentation methods such as cropping, rotating, and flipping in order to improve detection and segmentation accuracy. However, because labeling data is difficult, especially for semantic segmentation, these traditional augmentation methods help little when the training set is very limited. This paper proposes OFA-Net (One For All Network), a model that combines the object detection and semantic segmentation tasks. OFA-Net is trained with a strategy called “1-N Alternation”, which fuses features learned from detection and segmentation data. The results show that object detection data can be used to improve segmentation accuracy and that segmentation data, in turn, increase the confidence of object detection predictions. Finally, OFA-Net is trained without traditional data augmentation and evaluated on the KITTI test server. The model performs well on the KITTI Road Segmentation challenge and also handles the object detection task well.

Highlights

  • In recent years, convolutional networks (ConvNets) have contributed substantially to the dramatic improvements in computer vision tasks

  • This paper proposes a model called One For All Network (OFA-Net), meaning one model for all required results, to perform road segmentation and object detection on driving-environment images

  • This paper shows that by mixing object detection data with segmentation data using our “1-N Alternation” strategy, this unified multi-task learning [29] model trains faster and achieves higher accuracy and better generalization on the road segmentation task, along with high prediction confidence on the object detection task

Summary

Introduction

Convolutional networks (ConvNets) have contributed substantially to the dramatic improvements in computer vision tasks. Zeiler and Fergus [27] demonstrated that the features learned by ConvNets are hierarchical: the bottom layers focus on low-level features such as corners and edges, while the top layers attend to high-level features. Inspired by this idea, this paper proposes a model called OFA-Net (One For All, meaning one model for all required results) to perform road segmentation and object detection on driving-environment images. The model consists of three parts that serve as feature extractor, detection, and segmentation, respectively. It is fed object detection and semantic segmentation data alternately and uses a separate loss function to train each task. The benefits of this strategy are faster convergence, improved segmentation accuracy, and higher prediction confidence for object detection.
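
The alternate feeding of detection and segmentation data described above can be illustrated with a small scheduling sketch. This summary does not define what "1" and "N" refer to, so the sketch assumes, purely for illustration, one detection step followed by N segmentation steps; the function names `one_n_schedule` and `run_alternation` are hypothetical and are not taken from the paper.

```python
from itertools import islice
from typing import Dict, Iterator


def one_n_schedule(n: int,
                   single: str = "detection",
                   repeated: str = "segmentation") -> Iterator[str]:
    """Yield task names endlessly: one `single` step, then `n` `repeated` steps.

    Assumption: which task takes the "1" slot and which takes the "N" slot
    is not stated in the summary; the defaults here are illustrative only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    while True:
        yield single
        for _ in range(n):
            yield repeated


def run_alternation(steps: int, n: int) -> Dict[str, int]:
    """Count how many updates each task would receive over `steps` iterations."""
    counts = {"detection": 0, "segmentation": 0}
    for task in islice(one_n_schedule(n), steps):
        # In a real training loop, a batch for `task` would be drawn here,
        # passed through the shared feature extractor and the matching
        # task-specific part, and optimized with that task's own loss.
        counts[task] += 1
    return counts


if __name__ == "__main__":
    print(list(islice(one_n_schedule(2), 6)))
    print(run_alternation(steps=8, n=3))
```

Because both tasks update the shared feature extractor under such a schedule, the ratio N controls how strongly each data source shapes the shared features; the value that works best is an empirical question that this sketch does not address.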

Related Work
Transfer Learning and Multi-Task Learning
Simultaneous Detection and Segmentation
Initialization
Loss Functions and Loss Value Balancing
Alternate Training Strategy
Dataset Split and Experiments
Hyper Parameters
How Does Detection Affect Segmentation?
How Does Segmentation Affect Detection?
OFA-Net Results
Conclusions