Abstract

The restricted Boltzmann machine (RBM), a graphical model over binary random variables, has proven to be a powerful tool in machine learning. However, theoretical foundations for understanding the approximation ability of RBMs are lacking. In this paper, we study the representational power of RBMs, focusing on the number of hidden units sufficient for an RBM with a fixed number of inputs to compute certain classes of distributions of interest. First, we constructively show how RBMs can approximate, to arbitrary accuracy, any distribution that depends only on the scalar projection of the inputs onto a given vector. Then, for an arbitrary distribution, we explore how it can be represented in a form that depends on the scalar projections of the inputs onto a set of vectors, and we study the properties of these vectors; from this, a new proof of the universal approximation theorem for RBMs is deduced. Finally, we investigate the representational efficiency of RBMs by characterizing all distributions that RBMs can compute efficiently. More specifically, we show that a distribution can be computed by a polynomial-size RBM with polynomially bounded parameters if and only if its mass can be computed by a two-layer feedforward network with threshold/ReLU activation functions whose size and parameters are polynomially bounded.
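As background for the final equivalence, the following standard computation (notation b, c, W for the usual RBM biases and weight matrix is ours, not taken from the abstract) shows why RBMs are naturally linked to two-layer networks. An RBM over visible units v ∈ {0,1}^n and hidden units h ∈ {0,1}^m defines

\[
p(v,h) \;=\; \frac{1}{Z}\,\exp\!\big(b^{\top}v + c^{\top}h + v^{\top}Wh\big),
\]

and marginalizing out the hidden units factorizes over j, giving the unnormalized log-mass in closed form:

\[
\log \tilde{p}(v) \;=\; \log \sum_{h \in \{0,1\}^{m}} \exp\!\big(b^{\top}v + c^{\top}h + v^{\top}Wh\big)
\;=\; b^{\top}v + \sum_{j=1}^{m} \operatorname{softplus}\!\big(c_{j} + v^{\top}W_{:,j}\big),
\]

where \(\operatorname{softplus}(x) = \log(1 + e^{x})\). Since softplus is a smooth surrogate of \(\operatorname{ReLU}(x) = \max(0, x)\) (indeed \(\operatorname{softplus}(\beta x)/\beta \to \operatorname{ReLU}(x)\) as \(\beta \to \infty\)), the log-mass of an RBM is itself a two-layer feedforward network with softplus units; the paper's stated equivalence makes this structural connection precise for threshold/ReLU units with polynomially bounded size and parameters.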
