Abstract

With the commoditization of machine learning, more and more off-the-shelf models are available as part of code libraries or cloud services. Typically, data scientists and other users apply these models as “black boxes” within larger projects. In the case of regressing a scalar quantity, such APIs typically offer a predict() function, which outputs the estimated target variable (often referred to as ŷ or, in code, y_hat). However, many real-world problems may require some sort of deviation interval or uncertainty score rather than a single point-wise estimate. In other words, a mechanism is needed with which to answer the question “How confident is the system about that prediction?” Motivated by the lack of this characteristic in most predictive APIs designed for regression purposes, we propose a method that adds an uncertainty score to every black-box prediction. Since the underlying model is not accessible, and therefore standard Bayesian approaches are not applicable, we adopt an empirical approach and fit an uncertainty model using a labelled dataset (x, y) and the outputs ŷ of the black box. In order to use any predictive system as a black box and adapt to its complex behaviours, we propose three variants of an uncertainty model based on deep networks. The first adds a heteroscedastic noise component to the black-box output, the second predicts the residuals of the black box, and the third performs quantile regression using deep networks. Experiments on real financial data involving an in-production black-box system, and on two public datasets (energy forecasting and biological responses), illustrate and quantify how uncertainty scores can be added to black-box outputs.
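
As a concrete illustration of the first variant, a small network can be fitted that, given the original inputs together with the black-box prediction ŷ, learns an input-dependent noise scale around that prediction. The following is a minimal sketch only, not the paper's implementation: the use of PyTorch, the layer sizes, and the names HeteroscedasticUncertainty and gaussian_nll are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HeteroscedasticUncertainty(nn.Module):
    """Estimates an input-dependent noise scale sigma(z) around the
    black-box prediction y_hat (sketch of variant 1: heteroscedastic noise)."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features + 1, hidden),  # z = (x, y_hat)
            nn.ReLU(),
            nn.Linear(hidden, 1),               # predicts log sigma^2(z)
        )

    def forward(self, x, y_hat):
        z = torch.cat([x, y_hat.unsqueeze(-1)], dim=-1)
        return self.net(z).squeeze(-1)          # log-variance per example

def gaussian_nll(y, y_hat, log_var):
    """Negative log-likelihood of y ~ N(y_hat, sigma^2(z)); the black-box
    output y_hat is kept fixed, only the noise scale is learned."""
    return 0.5 * (log_var + (y - y_hat) ** 2 / log_var.exp()).mean()

# Training sketch: x, y are the labelled data, y_hat = black_box.predict(x)
# model = HeteroscedasticUncertainty(n_features=x.shape[1])
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# for _ in range(n_epochs):
#     opt.zero_grad()
#     loss = gaussian_nll(y, y_hat, model(x, y_hat))
#     loss.backward()
#     opt.step()
# uncertainty_score = model(x_new, y_hat_new).exp().sqrt()   # sigma(z)
```

Training minimises the Gaussian negative log-likelihood of y around the fixed black-box output, so only the noise head is learned; the resulting σ(z) can then be reported alongside every black-box prediction as an uncertainty score.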

Highlights

  • The success of machine learning on real-world problems has led to the increasing commoditization of machine learning systems

  • It has become more commonplace to use a predictive system as a black box inside a larger engineering system [45]

  • Our goal is to construct a model ψ(z) that provides a quantitative estimate of the uncertainty score for the predicted value y, given the preserved or distorted inputs z (one possible form of ψ is sketched after this list)

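One concrete way to realise such a ψ(z) is the quantile-regression variant mentioned in the abstract. The sketch below is an illustrative assumption rather than the authors' code: a small PyTorch network maps z, taken here to be the preserved inputs concatenated with ŷ, to a lower and an upper conditional quantile of y, trained with the pinball loss, and the interval width serves as the uncertainty score. The quantile levels 0.05/0.95 and the names pinball_loss and QuantilePsi are hypothetical choices.

```python
import torch
import torch.nn as nn

def pinball_loss(y, q_pred, tau):
    """Quantile (pinball) loss for quantile level tau."""
    diff = y - q_pred
    return torch.maximum(tau * diff, (tau - 1.0) * diff).mean()

class QuantilePsi(nn.Module):
    """psi(z): maps z = (x, y_hat) to the 0.05 and 0.95 conditional
    quantiles of y; the interval width is used as an uncertainty score."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),              # [q_low, q_high]
        )

    def forward(self, x, y_hat):
        z = torch.cat([x, y_hat.unsqueeze(-1)], dim=-1)
        return self.net(z)

# Usage sketch (quantile levels are illustrative choices):
# model = QuantilePsi(n_features=x.shape[1])
# q = model(x, y_hat)
# loss = pinball_loss(y, q[:, 0], 0.05) + pinball_loss(y, q[:, 1], 0.95)
# loss.backward(); opt.step()                 # standard training loop
# uncertainty_score = q[:, 1] - q[:, 0]       # wider interval = less confident
```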

Introduction

The success of machine learning on real-world problems has led to the increasing commoditization of machine learning systems. More and more predictive models are available “off-the-shelf” as part of code libraries [35], [38], machine learning servers [10], cloud-based services [1] or inside domain-specific black-box software [39]. Machine learning has become increasingly accessible to users and developers who are not specialists in this field, but who wish to consume its predictive functionality. It has become more commonplace to use a predictive system as a black box inside a larger engineering system [45]. We consider regression models implemented as black boxes. The term black box denotes a predictive model whose internals are not accessible: only its output predictions, e.g. obtained through a predict() call, can be observed.
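
To make the black-box setting concrete, a wrapper only needs the model's predict() interface to assemble the data on which an uncertainty model is fitted. The helper below is a hypothetical sketch (the function name, NumPy usage, and the choice z = (x, ŷ) are assumptions, not the paper's code); it collects the inputs alongside the black-box outputs and the residuals y − ŷ, which the second variant described in the abstract learns to predict.

```python
import numpy as np

def build_uncertainty_dataset(black_box, X, y):
    """Query an opaque model only through its predict() interface and
    assemble the data used to fit an uncertainty model on top of it.

    black_box : any object exposing predict(X) -> y_hat; its internals
                are never inspected, matching the black-box assumption.
    Returns z = (x, y_hat) pairs and the residuals y - y_hat.
    """
    y_hat = np.asarray(black_box.predict(X)).reshape(-1)
    z = np.column_stack([X, y_hat])     # preserved inputs + prediction
    residuals = y - y_hat
    return z, residuals

# Usage sketch: `black_box` could be a library model, a wrapped cloud
# endpoint, or an in-production system exposing only predict().
# z, residuals = build_uncertainty_dataset(black_box, X_train, y_train)
# A deep network (as in the sketches above) is then fitted to map z to a
# residual-based uncertainty score.
```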
