Abstract

Modern deep learning models achieve state-of-the-art results for many tasks in computer vision, such as image classification and segmentation. However, their adoption in high-risk applications, e.g. automated medical diagnosis systems, has been slow. One of the main reasons for this is that regular neural networks do not capture uncertainty. To assess uncertainty in classification, several techniques have been proposed that cast neural networks in a Bayesian setting. Amongst these techniques, Monte Carlo dropout is by far the most popular. This particular technique estimates the moments of the output distribution through sampling with different dropout masks. The output uncertainty of a neural network is then approximated as the sample variance. In this paper, we highlight the limitations of such a variance-based uncertainty metric and propose a novel approach. Our approach is based on the overlap between output distributions of different classes. We show that our technique leads to a better approximation of the inter-class output confusion. We illustrate the advantages of our method using benchmark datasets. In addition, we apply our metric to skin lesion classification, a real-world use case, and show that this yields promising results.
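As a concrete illustration of the Monte Carlo dropout procedure described in the abstract, the following is a minimal sketch, not the authors' code: a small classifier is evaluated several times with dropout kept active, and the per-class sample variance of the softmax outputs gives the variance-based uncertainty whose limitations the paper highlights. The network architecture, layer sizes, and the sample count T are illustrative assumptions.

```python
# Hedged sketch of Monte Carlo dropout uncertainty (illustrative, not the paper's code).
import torch
import torch.nn as nn

class DropoutMLP(nn.Module):
    """Toy classifier with a dropout layer; all sizes are placeholder assumptions."""
    def __init__(self, d_in=32, n_classes=4, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, 64), nn.ReLU(), nn.Dropout(p),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_samples(model, x, T=50):
    """Return T softmax samples of shape (T, K), with dropout left active."""
    model.train()  # keep dropout stochastic at inference time
    with torch.no_grad():
        return torch.stack([torch.softmax(model(x), dim=-1) for _ in range(T)])

model = DropoutMLP()
x = torch.randn(32)                       # one toy input
samples = mc_dropout_samples(model, x)    # (T, K) Monte Carlo softmax samples
mean_prob = samples.mean(dim=0)           # predictive mean over dropout masks
var_uncertainty = samples.var(dim=0)      # per-class sample variance ("variance-based" uncertainty)
```

A large per-class variance is then read as high uncertainty; the paper argues that this variance alone does not capture how strongly the class output distributions overlap, which motivates the BC metric discussed in the highlights below.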

Highlights

  • In the field of computer vision, deep learning has time and again set new state-of-the-art benchmarks for a variety of tasks, such as large-scale image classification [17, 43, 45], object localization [13, 14, 38], and semantic segmentation [29, 39]

  • Although we have described our BC uncertainty metric only in relation to Monte Carlo variational inference (MCVI), it is applicable to all approaches, regardless of how the output distribution is obtained

  • We presented a novel metric for quantifying output uncertainty in stochastic neural networks, based on the Bhattacharyya coefficient, aptly named BC uncertainty (see the sketch after this list)
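The excerpt does not spell out how the Bhattacharyya coefficient is computed from the sampled outputs, so the sketch below shows only one plausible construction under stated assumptions: the Monte Carlo softmax samples of the two highest-scoring classes are histogrammed over [0, 1] and their overlap is taken as the BC uncertainty. The function names, the binning, and the top-two-class pairing are hypothetical choices for illustration.

```python
# Hedged sketch: Bhattacharyya-coefficient overlap between per-class output distributions.
import numpy as np

def bhattacharyya_coefficient(a, b, bins=20):
    """Overlap of two 1-D sample sets on [0, 1]: 0 = disjoint, 1 = identical."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

# 'samples' stands in for (T, K) Monte Carlo softmax outputs; here a toy Dirichlet draw.
rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha=[5.0, 2.0, 1.0], size=100)

# Overlap between the two highest-scoring classes as an uncertainty proxy.
order = np.argsort(samples.mean(axis=0))[::-1]
bc = bhattacharyya_coefficient(samples[:, order[0]], samples[:, order[1]])
print(f"BC uncertainty (top-2 class overlap): {bc:.3f}")
```

A coefficient near 1 means the two classes' score distributions are nearly indistinguishable (high inter-class confusion), whereas a value near 0 means they are well separated.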


Summary

Introduction

In the field of computer vision, deep learning has time and again set new state-of-the-art benchmarks for a variety of tasks, such as large-scale image classification [17, 43, 45], object localization [13, 14, 38], and semantic segmentation [29, 39]. It is known that neural networks only output a point estimate of the true underlying predictive distribution.

2.1 Bayesian neural networks and variational inference

Consider a neural network as a probabilistic model p(y | x, w). For an input x ∈ ℝ^d, the network calculates a probability for each of the K possible outputs y ∈ Y, using the weights w. Y is the set of classes, and p(y | x, w) is a categorical distribution. Given a dataset D = {x_i, y_i}_{i=1}^{n}, the optimal weights w^* for the network can be learned by maximum likelihood estimation (MLE):

w^* = \arg\max_w \log p(D \mid w)                          (1)
    = \arg\max_w \sum_{i=1}^{n} \log p(y_i \mid x_i, w).   (2)
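To make equations (1) and (2) concrete, here is a short, self-contained sketch (toy data and architecture, purely for illustration) showing that MLE for a categorical p(y | x, w) amounts to minimizing the cross-entropy, i.e. the negative log-likelihood, by gradient descent.

```python
# Hedged sketch of Eqs. (1)-(2): MLE for a categorical p(y | x, w) via cross-entropy minimization.
import torch
import torch.nn as nn

d, K, n = 16, 3, 128                     # toy input dimension, class count, dataset size
X = torch.randn(n, d)                    # placeholder inputs x_i
y = torch.randint(0, K, (n,))            # placeholder labels y_i

model = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, K))
nll = nn.CrossEntropyLoss()              # -(1/n) * sum_i log p(y_i | x_i, w)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(100):
    opt.zero_grad()
    loss = nll(model(X), y)              # negative log-likelihood of the toy dataset
    loss.backward()
    opt.step()                           # ascends log p(D | w), i.e. Eq. (2)
```

The resulting w^* is a single point estimate, which is precisely the limitation the Bayesian treatment introduced in Section 2.1 addresses.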
