Abstract

Online performance prediction (or: observation) of deep neural networks (DNNs) in highly automated driving remains an unsolved task, as most DNNs are evaluated offline, requiring datasets with ground truth labels. In practice, however, DNN performance depends on the camera type used, on lighting and weather conditions, and on various other kinds of domain shift. Moreover, the input to DNN-based perception systems can be perturbed by adversarial attacks, requiring means to detect such input perturbations. In this work, we propose a method to mitigate these problems by a multi-task learning approach with monocular depth estimation as a secondary task, which enables us to predict the DNN's performance on various other (primary) tasks by evaluating only the depth estimation task against a physical depth measurement provided, e.g., by a LiDAR sensor. We show the effectiveness of our method for the primary task of semantic segmentation using various training datasets, test datasets, model architectures, and input perturbations. Our method provides an effective way to predict (observe) the performance of DNNs for semantic segmentation even on a single-image basis and is transferable to other primary DNN-based perception tasks in a straightforward manner.
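The multi-task setup described in the abstract can be pictured as a shared encoder with one decoder head per task. The following is a minimal sketch of that idea; all module names, layer sizes, and the number of classes are illustrative assumptions and not the architecture evaluated in the paper.

    # Minimal sketch of the multi-task idea: one shared encoder, one head per task.
    # Module names and sizes are illustrative only.
    import torch
    import torch.nn as nn

    class MultiTaskPerceptionNet(nn.Module):
        def __init__(self, num_classes: int = 19):
            super().__init__()
            # Shared feature extractor used by both the primary and secondary task.
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            )
            # Primary task head: per-pixel class logits for semantic segmentation.
            self.seg_head = nn.Conv2d(128, num_classes, kernel_size=1)
            # Secondary task head: per-pixel depth, kept positive via softplus.
            self.depth_head = nn.Sequential(nn.Conv2d(128, 1, kernel_size=1), nn.Softplus())

        def forward(self, image: torch.Tensor):
            features = self.encoder(image)
            return self.seg_head(features), self.depth_head(features)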

Highlights

  • Towards the development of systems for higher levels of automated driving (3-5, [1]), it is crucial that one can reliably detect failure of such systems or of their components

  • By experimental validation, we show that our method is applicable independently of the used training datasets, test dataset, model architecture, or even input perturbations, which degrade the performance of a perception deep neural network (DNN)

  • As, in principle, one could choose any of these metrics for the scope of this work, we chose the accuracy metric ACC, which showed the highest correlation with the mean intersection over union (mIoU) metric in preliminary experiments and is best suited to observe the performance of the primary task of semantic segmentation in an online setting (a sketch of both metrics follows this list)
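For reference, the two metrics named above can be sketched as follows. We assume ACC refers to the threshold-ratio accuracy commonly used in depth estimation; this reading, and all function names below, are our assumptions rather than the paper's exact definitions.

    # Hedged sketch of the two metrics: depth threshold accuracy and segmentation mIoU.
    import numpy as np

    def depth_accuracy(pred_depth, gt_depth, threshold=1.25):
        """Fraction of valid pixels whose depth ratio to the reference is within `threshold`."""
        valid = gt_depth > 0  # e.g., pixels with a LiDAR return
        ratio = np.maximum(pred_depth[valid] / gt_depth[valid],
                           gt_depth[valid] / pred_depth[valid])
        return float((ratio < threshold).mean())

    def mean_iou(pred_labels, gt_labels, num_classes):
        """Mean intersection over union over the classes present in the ground truth."""
        ious = []
        for c in range(num_classes):
            inter = np.logical_and(pred_labels == c, gt_labels == c).sum()
            union = np.logical_or(pred_labels == c, gt_labels == c).sum()
            if union > 0:
                ious.append(inter / union)
        return float(np.mean(ious))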

Summary

INTRODUCTION

Towards the development of systems for higher levels of automated driving (3-5, [1]), it is crucial that one can reliably detect failure of such systems or of their components. We show that, with depth estimation as an additional DNN-based secondary task, the redundancy with a LiDAR sensor can be used as a reliable online evaluation scheme. Note that such redundancy could potentially serve as a basis for a generic online performance prediction of any image-based perception task, which we demonstrate using the primary task of semantic segmentation. Our proposed solution offers a way to relate the performance of the primary task of semantic segmentation to the performance of the secondary task of depth estimation, whose performance can be observed during online operation. To this end, we show that neural networks trained in a multi-task fashion (predicting several outputs at once) exhibit a high correlation between the qualities of their outputs as measured by the task-specific metrics (cf. Fig. 2, orange parts).
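The correlation described above suggests a simple observer: offline, a regression from the depth metric to the segmentation metric is calibrated on labeled data; online, only the depth metric, computed against the LiDAR measurement, is needed to predict segmentation performance. Below is a minimal sketch of this idea; the linear regression form and all names are illustrative assumptions, not the paper's exact calibration procedure.

    # Sketch of the online performance observer: map a depth score (measurable
    # online against LiDAR) to a predicted segmentation score via an offline fit.
    import numpy as np

    class PerformancePredictor:
        def fit(self, depth_scores, seg_scores):
            # Offline calibration: least-squares fit of primary-task quality
            # (e.g., mIoU) against secondary-task quality (e.g., depth ACC).
            self.slope, self.intercept = np.polyfit(depth_scores, seg_scores, deg=1)
            return self

        def predict(self, depth_score):
            # Online: only the depth score is required, no segmentation labels.
            return self.slope * depth_score + self.intercept

    # Usage with hypothetical per-image calibration data:
    # predictor = PerformancePredictor().fit(depth_acc_per_image, miou_per_image)
    # estimated_miou = predictor.predict(current_depth_acc)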

RELATED WORK
Depth Estimation
Multi-Task Learning
Performance Prediction of Neural Networks
Original Contribution
THEORETICAL BACKGROUND
Semantic Segmentation
Self-Supervised Monocular Depth Estimation
Network Input Perturbations
Performance Evaluation Metrics
METHOD DESCRIPTION
Training Setup
Novel Online Performance Predictor in an Offline Test Setup
EXPERIMENTAL EVALUATION
Performance Metrics
Experimental Setup
First Analysis and Regression Calibration
Generalization Across Input Perturbations
Generalization Across Different Training Procedures
Generalization Across Different Network Architectures
Generalization Across Different Datasets
Findings
CONCLUSION