Abstract

Neural networks play a growing role in many scientific disciplines, including physics. Variational autoencoders (VAEs) are neural networks that are able to represent the essential information of a high-dimensional data set in a low-dimensional latent space, which has a probabilistic interpretation. In particular, the so-called encoder network, the first part of the VAE, which maps its input onto a position in latent space, additionally provides uncertainty information in terms of the variance around this position. In this work, an extension to the autoencoder architecture is introduced, the FisherNet. In this architecture, the latent space uncertainty is not generated using an additional information channel in the encoder but is derived from the decoder by means of the Fisher information metric. This architecture has advantages from a theoretical point of view, as it provides a direct uncertainty quantification derived from the model and also accounts for uncertainty cross-correlations. We show experimentally that the FisherNet produces more accurate data reconstructions than a comparable VAE and that its learning performance scales better with the number of latent space dimensions.
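The idea of deriving latent-space uncertainty from the decoder can be illustrated with a toy sketch. Assuming a Gaussian decoder likelihood p(x|z) = N(f(z), σ²I), the Fisher information metric in latent space is G(z) = J(z)ᵀJ(z)/σ², where J is the decoder Jacobian; its inverse is a full covariance whose off-diagonal entries capture the cross-correlations that a mean-field (diagonal-variance) VAE posterior ignores. The linear decoder below is a hypothetical stand-in for illustration, not the paper's architecture.

```python
import numpy as np

# Toy linear decoder f(z) = W z + b with Gaussian noise sigma^2.
# For this choice the decoder Jacobian J(z) = W everywhere, so the
# Fisher information metric in latent space is G = W^T W / sigma^2.
rng = np.random.default_rng(0)
latent_dim, data_dim, sigma = 2, 5, 0.1

W = rng.normal(size=(data_dim, latent_dim))  # toy decoder weights
G = W.T @ W / sigma**2                       # Fisher metric in latent space

# The inverse metric serves as a latent covariance estimate. Unlike the
# diagonal variance of a mean-field VAE encoder, it is a full matrix:
# off-diagonal entries encode uncertainty cross-correlations.
cov = np.linalg.inv(G)
print(cov.shape)                 # (2, 2)
print(np.allclose(cov, cov.T))   # symmetric, as a covariance must be
```

For a nonlinear decoder the Jacobian J(z) would vary with z, so the metric, and hence the uncertainty, becomes position-dependent in latent space.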

Highlights

  • Machine learning has become a key method for data analysis [1]

  • We show that the FisherNet achieves improved reconstruction accuracy compared to the standard variational autoencoder (VAE) and converges better for high-dimensional latent spaces

  • We derive a new variant of VAEs by expanding the scope of variational inference beyond the mean-field approach


Introduction

Machine learning has become a key method for data analysis [1]. Many machine learning methods can be classified as supervised or unsupervised learning. In supervised learning, the model is trained to output a specific feature of the data, for example, to classify the data into predetermined categories. Training such a model relies on training data labeled with the features we want to learn; these labels must be attached manually, so the amount of labeled data is limited. In unsupervised learning, the model is trained to learn patterns from the data directly, without any predetermined features.

