Abstract

Existing neural stochastic differential equation models, such as SDE-Net, can quantify the uncertainties of deep neural networks (DNNs) from a dynamical-system perspective. SDE-Net is dominated either by its drift net on in-distribution (ID) data, which yields good predictive accuracy, or by its diffusion net on out-of-distribution (OOD) data, which generates high diffusion to characterize model uncertainty. However, it does not consider more general situations that arise in practice, such as ID data corrupted by noise or with high missing rates. To handle noisy ID data effectively and obtain credible uncertainty estimates, we propose the vNPs-SDE model, which first applies variants of neural processes (NPs) to complete the noisy ID data, so that the completed ID data can then be processed more effectively by SDE-Net. Experimental results show that vNPs-SDE can be implemented with convolutional conditional neural processes (ConvCNPs), which are translation equivariant, and effectively handles ID data with high missing rates in one-dimensional (1D) regression and two-dimensional (2D) image classification tasks. Alternatively, vNPs-SDE can be implemented with conditional neural processes (CNPs) or attentive neural processes (ANPs), which are permutation invariant, and exceeds vanilla SDE-Net on multidimensional regression tasks.
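To make the two-stage design concrete, the following is a minimal, hypothetical sketch of a vNPs-SDE forward pass in PyTorch: an NP-variant completer (e.g., a ConvCNP) imputes the noisy or partially missing ID input, and the completed representation is then evolved by an SDE-Net-style drift/diffusion pair under Euler-Maruyama integration. All class and parameter names here (`VNPsSDE`, `DriftNet`, `DiffusionNet`, `completer`, `n_steps`) are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the vNPs-SDE two-stage forward pass; names and
# architectures are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn


class DriftNet(nn.Module):
    """Drift term f(h, t): fitted on ID data for predictive accuracy."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, h, t):
        return self.net(h)


class DiffusionNet(nn.Module):
    """Diffusion term g(h, t): meant to output high diffusion on OOD inputs."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, h, t):
        return self.net(h)


class VNPsSDE(nn.Module):
    """Stage 1: an NP variant completes noisy/missing ID data.
    Stage 2: an SDE-Net-style pair evolves the completed state."""
    def __init__(self, completer, dim, n_steps=10, t1=1.0):
        super().__init__()
        self.completer = completer          # e.g. a ConvCNP / CNP / ANP module
        self.drift = DriftNet(dim)
        self.diffusion = DiffusionNet(dim)
        self.n_steps, self.dt = n_steps, t1 / n_steps

    def forward(self, x, mask):
        # Stage 1: impute missing entries with the NP's predictive mean.
        h = self.completer(x, mask)
        # Stage 2: Euler-Maruyama integration of dh = f dt + g dW.
        for k in range(self.n_steps):
            t = k * self.dt
            dw = torch.randn_like(h) * self.dt ** 0.5
            h = h + self.drift(h, t) * self.dt + self.diffusion(h, t) * dw
        return h


# With an identity completer on fully observed data, this reduces to a
# plain SDE-Net-style forward pass:
model = VNPsSDE(completer=lambda x, m: x, dim=4)
out = model(torch.randn(8, 4), mask=None)
```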


Introduction

Deep learning models have achieved great success in many fields, such as image classification [1], computer vision [2], machine translation [3], and reinforcement learning [4]. However, existing studies have shown that deep neural network (DNN) models are usually miscalibrated and overconfident in their predictions, which can lead to misleading decisions on out-of-distribution (OOD) data. Bayesian neural network (BNN) methods were once regarded as a gold standard for uncertainty estimation in machine learning models [6,7], and they continue to serve as a strong baseline in recent benchmarks. To improve efficiency, existing studies adopt principal component analysis (PCA), a linear subspace feature-extraction method, to construct a parameter subspace of DNNs for Bayesian inference [9], and a curve parameter subspace method has been proposed to build a rich subspace containing diverse, high-performing models [10]. As an approximate Bayesian inference method, dropout in neural networks can be interpreted as an approximation to a Gaussian process (GP), and dropout variational inference (DVI) provides approximate Bayesian inference for large and complex DNN models [12].
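As a rough illustration of this dropout-as-approximate-inference idea [12], the sketch below keeps dropout active at test time and treats the spread of repeated stochastic forward passes as a predictive-uncertainty estimate (commonly called MC dropout). The architecture and hyperparameters are arbitrary assumptions for illustration, not taken from the cited work.

```python
# Minimal MC-dropout sketch: dropout stays on at inference, and the
# variance across stochastic forward passes approximates uncertainty.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(64, 1),
)

def mc_dropout_predict(model, x, n_samples=50):
    model.train()                     # keep dropout stochastic at test time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(0), samples.std(0)  # predictive mean and spread

x = torch.linspace(-3, 3, 100).unsqueeze(-1)
mean, std = mc_dropout_predict(model, x)
```

Inputs far from the training data typically yield a larger `std`; capturing this same OOD behavior through learned dynamics, rather than test-time sampling of dropout masks, is what SDE-Net's diffusion net is designed for.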

