Abstract

Restricted Boltzmann Machines (RBMs) and models derived from them have been successfully used as basic building blocks in deep artificial neural networks for automatic feature extraction and unsupervised weight initialization, but also as density estimators. Thus, their generative and discriminative capabilities, as well as their computational cost, are instrumental to a wide range of applications. Our main contribution is to look at RBMs from a topological perspective, bringing insights from network science. Firstly, we show that RBMs and Gaussian RBMs (GRBMs) are bipartite graphs which naturally have a small-world topology. Secondly, we demonstrate on both synthetic and real-world datasets that by constraining RBMs and GRBMs to a scale-free topology (while still considering local neighborhoods and the data distribution), we reduce the number of weights that need to be computed by a few orders of magnitude, at virtually no loss in generative performance. Thirdly, we show that, for a fixed number of weights, our proposed sparse models (which by design have a larger number of hidden neurons) achieve better generative capabilities than standard fully connected RBMs and GRBMs (which by design have a smaller number of hidden neurons), at no additional computational cost.

Highlights

  • Since its conception, deep learning (Bengio 2009) has been widely studied and applied, from pure academic research to large-scale industrial applications, due to its success in different real-world machine learning problems such as audio recognition (Lee et al 2009), reinforcement learning (Mnih et al 2015), transfer learning (Ammar et al 2013), and activity recognition (Mocanu et al 2015)

  • The main contribution of this paper is to look at the basic building blocks of deep learning, i.e. Restricted Boltzmann Machines (RBMs) and Gaussian RBMs (GRBMs) (Hinton and Salakhutdinov 2006), from a topological perspective, bringing insights from network science, an extension of graph theory which analyzes real-world complex networks (Strogatz 2001)

  • In the last two sets of experiments, we compare the Gaussian compleX Boltzmann Machine (GXBM)/XBM against three other methods: (1) the standard fully connected GRBM/RBM; (2) sparse GRBM/RBM models, denoted further GRBMFixProb (Fixed Probability)/RBMFixProb, in which the probability for any possible connection to exist is set to the number of weights of the counterpart GXBM/XBM model divided by the total number of possible connections for that specific configuration of hidden and visible neurons; and (3) sparse GRBM/RBM models, denoted further GRBMTrPrTr (Train Prune Train)/RBMTrPrTr, in which sparsity is obtained using the algorithm introduced in Han et al (2015) with L2 regularization, with the weight sparsity target set to the number of weights of the counterpart GXBM/XBM model
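The GRBMFixProb/RBMFixProb baseline described above can be sketched as follows: every possible visible-hidden connection is kept independently with a fixed probability chosen so that the expected number of surviving weights matches the sparse counterpart model. This is a minimal illustrative sketch (the function name and seeding are my own, not from the paper):

```python
import numpy as np

def fixed_probability_mask(n_visible, n_hidden, n_target_weights, seed=0):
    """Sample a sparse bipartite connectivity mask: each of the
    n_visible * n_hidden possible connections exists independently
    with probability p = n_target_weights / (n_visible * n_hidden),
    so the expected number of weights matches the target."""
    rng = np.random.default_rng(seed)
    p = n_target_weights / (n_visible * n_hidden)
    return rng.random((n_visible, n_hidden)) < p

# Example: keep roughly 10% of the connections of a 784x1000 RBM.
mask = fixed_probability_mask(784, 1000, 78400)
```

During training, such a mask would simply be multiplied element-wise into the weight matrix after each update, so pruned connections stay at zero.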


Summary

Introduction

Deep learning (Bengio 2009) is widely studied and applied, from pure academic research to large-scale industrial applications, due to its success in different real-world machine learning problems such as audio recognition (Lee et al 2009), reinforcement learning (Mnih et al 2015), transfer learning (Ammar et al 2013), and activity recognition (Mocanu et al 2015). Deep learning models are artificial neural networks with multiple layers of hidden neurons, which have connections only among neurons belonging to consecutive layers, but no connections within the same layer. These models are composed of basic building blocks, such as Restricted Boltzmann Machines (RBMs) (Smolensky 1987). To formalize a Boltzmann machine and its variants, three main ingredients are required: an energy function providing scalar values for a given configuration of the network, the probabilistic inference, and the learning rules required for fitting the free parameters. This bidirectionally connected network with stochastic nodes has no unit connected with itself. The model architecture was restricted by not allowing intra-layer connections between the units, as depicted in Fig. 2 (left). Since their conception, different types of Boltzmann machines have been developed and successfully applied.
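The energy function mentioned above can be made concrete for a binary RBM: with visible units v, hidden units h, weight matrix W, and biases a and b, the standard energy is E(v, h) = -aᵀv - bᵀh - vᵀWh. Note there are no v-v or h-h terms, which is exactly the bipartite restriction. A minimal sketch (variable names are illustrative):

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy of a joint configuration (v, h) of a binary RBM:
    E(v, h) = -a.v - b.h - v.W.h
    The absence of intra-layer terms reflects the restriction that
    connections exist only between the visible and hidden layers."""
    return float(-(a @ v) - (b @ h) - (v @ W @ h))

# Tiny example: 2 visible and 2 hidden units, all weights 1, zero biases,
# all units on -> E = -(sum of W) = -4.
W = np.ones((2, 2))
a, b = np.zeros(2), np.zeros(2)
v, h = np.ones(2), np.ones(2)
E = rbm_energy(v, h, W, a, b)
```

Lower energy corresponds to higher joint probability P(v, h) ∝ exp(-E(v, h)); a sparse-topology model simply zeroes out most entries of W, so only the surviving connections contribute to the sum.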

