Hierarchical generative models can produce data samples based on the statistical structure of their training distribution. This capability can be linked to current theories in computational neuroscience, which propose that spontaneous brain activity at rest is the manifestation of top-down dynamics of generative models detached from action-perception cycles. A popular class of hierarchical generative models is that of Deep Belief Networks (DBNs), which are energy-based deep learning architectures that can learn multiple levels of representations in a completely unsupervised way exploiting Hebbian-like learning mechanisms. In this work, we study the generative dynamics of a recent extension of the DBN, the iterative DBN (iDBN), which more faithfully simulates neurocognitive development by jointly tuning the connection weights across all layers of the hierarchy. We characterize the number of states visited during top-down sampling and investigate whether the heterogeneity of visited attractors could be increased by initiating the generation process from biased hidden states. To this end, we train iDBN models on well-known datasets containing handwritten digits and pictures of human faces, and show that the ability to generate diverse data prototypes can be enhanced by initializing top-down sampling from “chimera states”, which represent high-level features combining multiple abstract representations of the sensory data. Although the models are not always able to transition between all potential target states within a single-generation trajectory, the iDBN shows richer top-down dynamics in comparison to a shallow generative model (a single-layer Restricted Bolzamann Machine). We further show that the generated samples can be used to support continual learning through generative replay mechanisms. Our findings suggest that the top-down dynamics of hierarchical generative models is significantly influenced by the shape of the energy function, which depends both on the depth of the processing architecture and on the statistical structure of the sensory data.
Read full abstract