Abstract

Deep neural networks (DNNs) have surpassed human-level accuracy in various learning tasks. However, unlike humans, who have a natural cognitive intuition for probabilities, DNNs cannot express the uncertainty in their output decisions. This limits the deployment of DNNs in mission-critical domains, such as warfighter decision-making or medical diagnosis. Bayesian inference provides a principled approach to reasoning about a model's uncertainty by estimating the posterior distribution of its unknown parameters. The challenge in DNNs stems from the multiple layers of non-linearities, which make the propagation of high-dimensional distributions mathematically intractable. This paper establishes the theoretical and algorithmic foundations of uncertainty (or belief) propagation by developing new deep learning models named PremiUm-CNNs (Propagating Uncertainty in Convolutional Neural Networks). We introduce a tensor normal distribution as a prior over convolutional kernels and estimate the variational posterior by maximizing the evidence lower bound (ELBO). We first derive a first-order mean-covariance propagation framework. We then develop a framework based on the unscented transformation (correct at least up to the second order) that propagates sigma points of the variational distribution through the layers of a CNN. The propagated covariance of the predictive distribution captures the uncertainty in the output decision. Comprehensive experiments conducted on diverse benchmark datasets demonstrate: 1) superior robustness against noise and adversarial attacks, 2) self-assessment through predictive uncertainty that increases quickly with increasing levels of noise or attack, and 3) an ability to distinguish a targeted attack from ambient noise.
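To make the sigma-point idea concrete, here is a minimal NumPy sketch (our toy illustration, not the paper's implementation) of the unscented transform: sigma points of a Gaussian input are propagated through a nonlinearity (a ReLU here) and recombined into an output mean and covariance. The function name unscented_transform and the parameter kappa are our own choices for this sketch.

    import numpy as np

    def unscented_transform(mean, cov, f, kappa=1.0):
        """Propagate N(mean, cov) through f using 2n+1 sigma points."""
        n = mean.size
        # Columns of L form a square root of (n + kappa) * cov.
        L = np.linalg.cholesky((n + kappa) * cov)
        sigma = np.vstack([mean, mean + L.T, mean - L.T])  # (2n+1, n) points
        w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))    # sigma-point weights
        w[0] = kappa / (n + kappa)
        y = np.array([f(s) for s in sigma])                # push points through f
        mean_y = w @ y                                     # recombined mean
        d = y - mean_y
        cov_y = (w[:, None] * d).T @ d                     # recombined covariance
        return mean_y, cov_y

    relu = lambda x: np.maximum(x, 0.0)
    m = np.array([0.5, -0.2])
    C = np.array([[0.3, 0.1], [0.1, 0.2]])
    print(unscented_transform(m, C, relu))

In the paper's setting this recombination is applied layer by layer, so the covariance that reaches the output captures the uncertainty in the decision.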

Highlights

  • Deep neural networks (DNNs) have achieved state-of-the-art performance in a wide assortment of tasks, including computer vision and pattern recognition [1], [2]

  • Another earlier effort included Hamiltonian Monte Carlo (HMC), i.e., a method based on Markov chain Monte Carlo (MCMC), for generating samples from the posterior distribution, which suffered from computational challenges [25] (a toy HMC sampler is sketched after these highlights)

  • We propose a framework for Propagating Uncertainty in Convolutional Neural Networks, PremiUm-CNN, which enables the estimation of uncertainty at the output decision
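To illustrate why sample-based posterior inference is costly, the sketch below (our toy example, not code from [25]) implements a bare-bones HMC sampler for a two-dimensional Gaussian target; every leapfrog step needs a fresh gradient evaluation, which is the computational bottleneck at DNN scale. The helper logp_grad and all parameter names are hypothetical choices for this sketch.

    import numpy as np

    def hmc_sample(logp_grad, x0, n_samples=1000, step=0.1, n_leap=20, seed=0):
        """Bare-bones HMC: leapfrog proposals + Metropolis accept/reject."""
        rng = np.random.default_rng(seed)
        x, samples = x0.copy(), []
        for _ in range(n_samples):
            p = rng.standard_normal(x.size)          # resample momentum
            x_new = x.copy()
            lp0, g = logp_grad(x_new)
            p_new = p + 0.5 * step * g               # first half momentum step
            for _ in range(n_leap):
                x_new += step * p_new                # full position step
                lp_new, g = logp_grad(x_new)         # one gradient per step
                p_new += step * g                    # full momentum step
            p_new -= 0.5 * step * g                  # undo the extra half step
            log_accept = (lp_new - 0.5 * p_new @ p_new) - (lp0 - 0.5 * p @ p)
            if np.log(rng.random()) < log_accept:    # Metropolis correction
                x = x_new
            samples.append(x.copy())
        return np.array(samples)

    logp_grad = lambda x: (-0.5 * x @ x, -x)         # standard Gaussian target
    draws = hmc_sample(logp_grad, np.zeros(2))
    print(draws.mean(axis=0), np.cov(draws.T))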


Summary

Introduction

Deep neural networks (DNNs) have achieved state-of-the-art performance in a wide assortment of tasks, including computer vision and pattern recognition [1], [2]. One classical approach to Bayesian inference is the Laplace approximation, in which the mean of the posterior is given by the maximum a posteriori (MAP) estimate and the covariance by the inverse of the Hessian of the negative log-likelihood, which is intractable for DNNs. Ritter et al. recently proposed a scalable approximation for estimating the Hessian; however, the Laplace approximation was employed at test time only, i.e., the training was performed in a deterministic setting without learning uncertainty from the training dataset [16]. Another earlier effort included Hamiltonian Monte Carlo (HMC), i.e., a method based on Markov chain Monte Carlo (MCMC), for generating samples from the posterior distribution, which suffered from computational challenges [25]. The assumed density filtering (ADF) approximation proposed by Hernandez-Lobato and Adams eliminated the dependence on data ordering by performing multiple ADF passes over the data; the full expectation propagation (EP) implementation remained impractical for DNNs due to massive computational and memory requirements [29].
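For intuition about the Laplace approach mentioned above, the following minimal sketch (a toy Bayesian logistic regression, not the DNN-scale method of Ritter et al. [16]) finds the MAP estimate with Newton's method and takes the inverse Hessian of the negative log-posterior as the posterior covariance. Function and variable names are our own for this illustration.

    import numpy as np

    def laplace_logreg(X, y, prior_var=1.0, iters=25):
        """Laplace approximation for logistic regression with a Gaussian prior."""
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(iters):                        # Newton steps to the MAP
            p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
            grad = X.T @ (p - y) + w / prior_var      # gradient of -log posterior
            H = X.T @ (X * (p * (1 - p))[:, None]) + np.eye(d) / prior_var
            w -= np.linalg.solve(H, grad)
        return w, np.linalg.inv(H)                    # mean = MAP, cov = H^{-1}

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 3))
    y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
    w_map, cov = laplace_logreg(X, y)
    print(w_map)                                      # posterior mean
    print(np.sqrt(np.diag(cov)))                      # marginal std. deviations

At DNN scale the Hessian can be neither formed nor inverted directly, which is what motivates the scalable approximation of [16].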
