KS(conf): A Light-Weight Test if a Multiclass Classifier Operates Outside of Its Specifications

Rémy Sun,Christoph H Lampert

doi:10.1007/s11263-019-01232-x

Abstract

We study the problem of automatically detecting if a given multi-class classifier operates outside of its specifications (out-of-specs), i.e. on input data from a different distribution than what it was trained for. This is an important problem to solve on the road towards creating reliable computer vision systems for real-world applications, because the quality of a classifier’s predictions cannot be guaranteed if it operates out-of-specs. Previously proposed methods for out-of-specs detection make decisions on the level of single inputs. This, however, is insufficient to achieve a low false positive rate and a high true positive rate at the same time. In this work, we describe a new procedure named KS(conf), based on statistical reasoning. Its main component is a classical Kolmogorov–Smirnov test that is applied to the set of predicted confidence values for batches of samples. Working with batches instead of single samples allows increasing the true positive rate without negatively affecting the false positive rate, thereby overcoming a crucial limitation of single-sample tests. We show by extensive experiments using a variety of convolutional network architectures and datasets that KS(conf) reliably detects out-of-specs situations even under conditions where other tests fail. It furthermore has a number of properties that make it an excellent candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with any classifier that outputs confidence scores, and requires no a priori knowledge about how the data distribution could change.
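The batch-level test outlined in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names, the use of one confidence score per sample, the synthetic Beta-distributed scores, and the asymptotic threshold formula are all assumptions made for illustration. The idea shown is a one-sample Kolmogorov–Smirnov test that compares a batch of confidence values against the empirical distribution of confidences collected under within-specs conditions.

```python
import bisect
import math
import random


def make_reference_cdf(val_conf):
    """Return the empirical CDF of within-specs (e.g. validation) confidences."""
    ordered = sorted(val_conf)
    n = len(ordered)

    def cdf(x):
        # Fraction of reference scores that are <= x.
        return bisect.bisect_right(ordered, x) / n

    return cdf


def ks_statistic(batch_conf, cdf):
    """One-sample KS statistic of a batch against the reference CDF."""
    p = sorted(cdf(c) for c in batch_conf)
    m = len(p)
    # Largest gap between the batch's empirical CDF and the uniform CDF,
    # checked just after (upper) and just before (lower) each sorted value.
    upper = max((i + 1) / m - p_i for i, p_i in enumerate(p))
    lower = max(p_i - i / m for i, p_i in enumerate(p))
    return max(upper, lower)


def ks_threshold(m, alpha):
    """Asymptotic critical value (an assumption; exact tables differ for small m)."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(m)


def is_out_of_specs(batch_conf, cdf, alpha=0.01):
    """Flag the batch if its KS statistic exceeds the threshold for level alpha."""
    return ks_statistic(batch_conf, cdf) > ks_threshold(len(batch_conf), alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for classifier confidences (hypothetical data).
    reference = [rng.betavariate(8, 2) for _ in range(5000)]
    shifted_batch = [rng.betavariate(2, 2) for _ in range(100)]
    cdf = make_reference_cdf(reference)
    print("KS statistic:", ks_statistic(shifted_batch, cdf))
    print("flagged:", is_out_of_specs(shifted_batch, cdf))
```

In this sketch, `alpha` plays the role of the target false positive rate, and the threshold shrinks as the batch size `m` grows. That is one way to see why larger batches can raise the true positive rate without raising the false positive rate.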

Highlights

Over the last years, and in particular with the emergence of deep convolutional networks (ConvNets), computer vision systems have become accurate and reliable enough to perform tasks of practical relevance autonomously and over long periods of time.
We demonstrate the power of KS(conf) using five state-of-the-art ConvNet architectures (ResNet50, VGG19, SqueezeNet, MobileNet25, NASNetAlarge), challenging real-world image datasets (ImageNet ILSVRC 2012, Animals with Attributes 2, DAVIS) and a variety of possible out-of-specs scenarios.
After an introduction to the experimental setting and data sources in Sect. 5, we present our experimental evaluation divided into three parts, each of which we consider of potentially independent interest: an analysis of the limits of tests acting on single samples for out-of-specs detection (Sect. 6), an analysis of batch-based methods (Sect. 7), and a study of how modern ConvNets react to changes in their input acquisition setup.

Summary

Introduction

In particular with the emergence of deep convolutional networks (ConvNets), computer vision systems have become accurate and reliable enough to perform tasks of practical relevance autonomously and over long periods of time. This has opened opportunities for the [...] If a system works well on a sufficiently large amount of data fulfilling both conditions, practical experience as well as statistical learning theory tell us that it will work well in the future. We call this operating within the specifications (within-specs).

Communicated by Thomas Brox.



Journal: International Journal of Computer Vision
Publication Date: Oct 10, 2019
Citations: 4
License type: open-access
