Abstract

In this work we study the distributed representations learnt by generative neural network models. In particular, we investigate the properties of redundant and synergistic information that groups of hidden neurons contain about the target variable. To this end, we use an emerging branch of information theory called partial information decomposition (PID) and track the informational properties of the neurons through training. We find two distinct phases during the training process: a short initial phase in which the neurons learn redundant information about the target, and a second phase in which the neurons specialise and each learns unique information about the target. We also find that in smaller networks individual neurons learn more specific information about certain features of the input, suggesting that learning pressure can encourage disentangled representations.
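
To make the decomposition concrete, here is a minimal sketch of a two-source PID using the Williams–Beer I_min redundancy measure. The specific PID measure used in the paper may differ, and the joint distribution p(t, x1, x2) over a target T and two hidden neurons is a placeholder; the XOR example at the end shows a purely synergistic case.

```python
# Two-source partial information decomposition (PID) sketch using the
# Williams-Beer I_min redundancy measure (the paper's exact measure may
# differ). p[t, x1, x2] is a hypothetical joint distribution over a
# target T and two hidden neurons X1, X2.
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits for a joint distribution p_xy[x, y]."""
    px = p_xy.sum(axis=1, keepdims=True)
    py = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float((p_xy[mask] * np.log2(p_xy[mask] / (px @ py)[mask])).sum())

def pid_min(p):
    """Williams-Beer PID of I(T; X1, X2) given p[t, x1, x2]."""
    pt = p.sum(axis=(1, 2))                  # p(t)
    joints = [p.sum(axis=2), p.sum(axis=1)]  # p(t, x1), p(t, x2)
    # Specific information I(T=t; Xi) = sum_x p(x|t) log2(p(t|x)/p(t)).
    spec = np.zeros((2, len(pt)))
    for i, ptx in enumerate(joints):
        px = ptx.sum(axis=0)                 # p(xi)
        for t in range(len(pt)):
            for x in range(ptx.shape[1]):
                if ptx[t, x] > 0:
                    spec[i, t] += (ptx[t, x] / pt[t]) * np.log2(
                        ptx[t, x] / (pt[t] * px[x]))
    # Redundancy: expected minimum specific information over the sources.
    redundancy = float((pt * spec.min(axis=0)).sum())
    i1 = mutual_information(joints[0])
    i2 = mutual_information(joints[1])
    i_joint = mutual_information(p.reshape(p.shape[0], -1))  # I(T; X1,X2)
    unique1, unique2 = i1 - redundancy, i2 - redundancy
    synergy = i_joint - redundancy - unique1 - unique2
    return redundancy, unique1, unique2, synergy

# Example: T = X1 XOR X2 with uniform inputs is purely synergistic.
p = np.zeros((2, 2, 2))
for x1 in (0, 1):
    for x2 in (0, 1):
        p[x1 ^ x2, x1, x2] = 0.25
print(pid_min(p))  # ~ (0.0, 0.0, 0.0, 1.0): one bit, all synergy
```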

Highlights

  • Neural networks are famous for their excellent performance, yet infamous for their thin theoretical grounding

  • We used a stochastic binarised version of MNIST: every time an image was fed as input to the network, the value of each pixel was sampled from a Bernoulli distribution with probability equal to the normalised intensity of that pixel (see the sketch after this list)

  • The gradients were estimated with contrastive divergence [24] and the weights were optimised with vanilla stochastic gradient descent at a fixed learning rate (0.01); a sketch of one such update also follows below
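
A minimal sketch of the stochastic binarisation referenced in the highlights, assuming 8-bit grayscale inputs (the exact preprocessing pipeline is not specified on this page):

```python
import numpy as np

def binarise(image, rng):
    """Stochastically binarise a grayscale image: each pixel becomes 1
    with probability equal to its normalised intensity, resampled on
    every presentation of the image."""
    p = image.astype(np.float64) / 255.0      # normalised intensity in [0, 1]
    return (rng.random(p.shape) < p).astype(np.uint8)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)  # stand-in for an MNIST digit
sample = binarise(image, rng)  # a fresh binary sample on each call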

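As a concrete illustration of the training recipe in the last highlight, here is a minimal sketch of a single CD-1 update for a binary restricted Boltzmann machine with vanilla SGD at the stated learning rate. The RBM form and the layer sizes are assumptions for illustration, not details confirmed by this page.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, rng, lr=0.01):
    """One CD-1 update (contrastive divergence with a single Gibbs step)
    for a binary RBM, applied with plain SGD at a fixed learning rate.
    v0: (batch, n_visible) binary data; W: (n_visible, n_hidden)."""
    batch = v0.shape[0]
    # Positive phase: hidden activations driven by the data.
    ph0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(ph0.shape) < ph0).astype(v0.dtype)
    # Negative phase: one Gibbs step to get the model's reconstruction.
    pv1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigmoid(v1 @ W + b_h)
    # CD-1 gradient estimate: <v h>_data - <v h>_reconstruction.
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / batch
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (ph0 - ph1).mean(axis=0)

rng = np.random.default_rng(0)
n_visible, n_hidden = 784, 64             # hypothetical layer sizes
W = rng.normal(0, 0.01, (n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
v0 = (rng.random((32, n_visible)) < 0.5).astype(np.float64)  # stand-in batch
cd1_step(v0, W, b_v, b_h, rng)
```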

Introduction

Neural networks are famous for their excellent performance, yet infamous for their thin theoretical grounding. While common deep learning “tricks” that prove empirically successful are often later found to have a theoretical justification (e.g., the Bayesian interpretation of dropout [1,2]), deep learning research still operates “in the dark”, guided almost exclusively by empirical performance. One common topic in learning theory is the study of data representations, and in deep learning it is the hierarchy of such representations that is often hailed as the key to neural networks’ success [3]. A representation is said to be disentangled if it has a factorisable or compositional structure, with consistent semantics associated with the different generating factors of the underlying data-generation process.
