Abstract

We study the problem of automatically detecting if a given multi-class classifier operates outside of its specifications (out-of-specs), i.e. on input data from a different distribution than what it was trained for. This is an important problem to solve on the road towards creating reliable computer vision systems for real-world applications, because the quality of a classifier’s predictions cannot be guaranteed if it operates out-of-specs. Previously proposed methods for out-of-specs detection make decisions on the level of single inputs. This, however, is insufficient to achieve low false positive rate and high false negative rates at the same time. In this work, we describe a new procedure named KS(conf), based on statistical reasoning. Its main component is a classical Kolmogorov–Smirnov test that is applied to the set of predicted confidence values for batches of samples. Working with batches instead of single samples allows increasing the true positive rate without negatively affecting the false positive rate, thereby overcoming a crucial limitation of single sample tests. We show by extensive experiments using a variety of convolutional network architectures and datasets that KS(conf) reliably detects out-of-specs situations even under conditions where other tests fail. It furthermore has a number of properties that make it an excellent candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with any classifier that outputs confidence scores, and requires no a priori knowledge about how the data distribution could change.

Highlights

  • Over the last years, and in particular with the emergence of deep convolutional networks (ConvNets), computer vision systems have become accurate and reliable enough to perform tasks of practical relevance autonomously and over long periods of time

  • We demonstrate the power of KS(conf) using five state-of-the-art ConvNets architectures (ResNet50, VGG19, SqueezeNet, MobileNet25, NASNetAlarge), challenging real-world image datasets (ImageNet ILSVRC 2012, Animals with Attributes 2, DAVIS) and a variety of possible out-of-specs scenarios

  • After an introduction to the experimental setting and data sources in Sect. 5, we present our experimental evaluation divided into three parts, each of which we consider of potentially independent interest: an analysis of the limits of tests acting on single samples for out-of-specs detection (Sect. 6), an analysis of batch-based methods (Sect. 7), and a study how modern ConvNets react to changes of their inputs acquisition setup

Read more

Summary

Introduction

In particular with the emergence of deep convolutional networks (ConvNets), computer vision systems have become accurate and reliable enough to perform tasks of practical relevance autonomously and over long periods of time. This has opened opportunities for the Communicated by Thomas Brox. If a system works well on a sufficiently large amount of data fulfilling both conditions, practical experience as well as statistical learning theory tell us that it will work well in the future We call this operating within the specifications (within-specs)

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call