Optimal training of Mean Variance Estimation neural networks

Laurens Sluijterman, Eric Cator, Tom Heskes

doi:10.1016/j.neucom.2024.127929

Abstract

This paper focusses on the optimal implementation of a Mean Variance Estimation network (MVE network) (Nix and Weigend, 1994). This type of network is often used as a building block for uncertainty estimation methods in a regression setting, for instance Concrete dropout (Gal et al., 2017) and Deep Ensembles (Lakshminarayanan et al., 2017). Specifically, an MVE network assumes that the data is produced from a normal distribution with a mean function and a variance function. The MVE network outputs a mean and a variance estimate and optimizes the network parameters by minimizing the negative log-likelihood.

In our paper, we present two significant insights. Firstly, the convergence difficulties reported in recent work can be prevented relatively easily by following the simple yet often overlooked recommendation from the original authors that a warm-up period should be used. During this period, only the mean is optimized, with the variance held fixed. We demonstrate the effectiveness of this step through experimentation, highlighting that it should be standard practice. As a side note, we examine whether, after the warm-up, it is beneficial to fix the mean while optimizing the variance or to optimize both simultaneously. Here, we do not observe a substantial difference. Secondly, we introduce a novel improvement of the MVE network: separate regularization of the mean and the variance estimate. We demonstrate on toy examples, on multiple benchmark UCI regression data sets, and on the UTKFace data set that following the original recommendations and applying the novel separate regularization can lead to significant improvements.
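The training objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed warm-up variance of 1.0, and the use of an L2 penalty as the form of regularization are all assumptions made for illustration, since the abstract only states that the mean and variance are regularized separately. The sketch computes the Gaussian negative log-likelihood, holds the variance fixed during warm-up so that only the mean is fitted, and applies separate penalty strengths to the mean and variance parameters afterwards.

```python
import math


def gaussian_nll(y, mean, var):
    """Average negative log-likelihood of targets y under N(mean_i, var_i).

    Every variance must be strictly positive; math.log raises otherwise.
    """
    total = 0.0
    for yi, mi, vi in zip(y, mean, var):
        total += 0.5 * math.log(2.0 * math.pi * vi) + (yi - mi) ** 2 / (2.0 * vi)
    return total / len(y)


def l2_penalty(weights):
    """Sum of squared weights (assumed form of regularization)."""
    return sum(w * w for w in weights)


def mve_objective(y, mean, var, mean_weights, var_weights,
                  lambda_mean, lambda_var, warmup, fixed_var=1.0):
    """Loss for one batch of a hypothetical MVE network.

    During warm-up the predicted variance is ignored and replaced by a
    constant (assumed value: fixed_var), so only the mean is fitted.
    After warm-up the predicted variance is used, and the mean and
    variance parameters each get their own regularization strength.
    """
    if warmup:
        constant_var = [fixed_var] * len(y)
        return gaussian_nll(y, mean, constant_var) + lambda_mean * l2_penalty(mean_weights)
    return (gaussian_nll(y, mean, var)
            + lambda_mean * l2_penalty(mean_weights)
            + lambda_var * l2_penalty(var_weights))


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0]
    mean = [1.1, 1.9, 3.2]
    var = [0.5, 0.5, 2.0]
    print(mve_objective(y, mean, var, [0.3, -0.2], [0.1], 1e-2, 1e-1, warmup=True))
    print(mve_objective(y, mean, var, [0.3, -0.2], [0.1], 1e-2, 1e-1, warmup=False))
```

With the variance fixed, the warm-up loss is a scaled squared error plus a constant, which is why this phase amounts to ordinary mean regression. Using two separate strengths, `lambda_mean` and `lambda_var`, lets the variance estimate be regularized more or less strongly than the mean.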

Journal: Neurocomputing	Publication Date: Jun 6, 2024
Citations: 3	License type: cc-by

Similar Papers

Geostatistical Estimation Variance for the Spatial Mean in Two-Dimensional Systematic Sampling
Philippe Aubry ... Domitien Debouzie
Ecology | VOL. 81
01 Feb 2000

Specifying Sample Masses for Gravimetric Testing Procedures
C C Balascio
Transactions of the ASAE | VOL. 34
01 Jan 1991

Diagnosis of dermatophytosis in cats using artificial neural networks
А.А Bushmina ... I.V Kireev
Veterinaria i kormlenie | VOL. -
01 Feb 2023
