Accelerating deep learning with memcomputing

Haik Manukian, Fabio L Traversa, Massimiliano Di Ventra

doi:10.1016/j.neunet.2018.10.012

Open Access

https://doi.org/10.1016/j.neunet.2018.10.012

Abstract

Restricted Boltzmann machines (RBMs) and their extensions, often called “deep-belief networks”, are powerful neural networks that have found applications in machine learning and artificial intelligence. The standard way to train these models relies on an iterative unsupervised procedure based on Gibbs sampling, called “contrastive divergence”, followed by additional supervised tuning via back-propagation. However, this procedure has been shown not to follow any gradient and can lead to suboptimal solutions. In this paper, we show an efficient alternative to contrastive divergence by means of simulations of digital memcomputing machines (DMMs) that compute the gradient of the log-likelihood involved in unsupervised training. We test our approach on pattern recognition using a modified version of the MNIST data set of hand-written numbers. DMMs effectively sample the vast phase space defined by the probability distribution of RBMs, and provide a good approximation close to the optimum. This efficient search significantly reduces the number of generative pretraining iterations necessary to achieve a given level of accuracy in the MNIST data set, and yields a total performance gain over the traditional approaches. In fact, the acceleration of the pretraining achieved by simulating DMMs is comparable, in number of iterations, to the recently reported hardware application of the quantum annealing method on the same network and data set. Notably, however, DMMs perform far better than the reported quantum annealing results in terms of quality of the training. Finally, we also compare our method to recent advances in supervised training, like batch-normalization and rectifiers, that seem to reduce the advantage of pretraining. We find that the memcomputing method still maintains a quality advantage (>1% in accuracy, corresponding to a 20% reduction in error rate) over these approaches, even though the network pretrained with memcomputing defines a more non-convex landscape using sigmoidal activation functions without batch-normalization. Our approach is agnostic about the connectivity of the network. Therefore, it can be extended to train full Boltzmann machines, and even deep networks at once.
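For readers unfamiliar with the baseline the abstract argues against, the sketch below illustrates the textbook form of the RBM log-likelihood gradient with respect to a weight, namely the data average of v_i h_j minus the model average of v_i h_j, and the one-step contrastive divergence (CD-1) estimate of the model term. It is a minimal illustration only: it does not reproduce the authors' DMM simulation, and all function names, the toy data, and the 3x2 network size are assumptions made for this example. The brute-force enumeration serves purely as a reference value and is feasible only for very small networks.

```python
import itertools
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def visible_probs(h, W, a):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]


def sample(probs, rng):
    """Draw independent binary units with the given 'on' probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def positive_phase(data, W, b):
    """Data term <v_i h_j>_data, averaged over the batch."""
    pos = [[0.0] * len(b) for _ in range(len(W))]
    for v in data:
        ph = hidden_probs(v, W, b)
        for i in range(len(W)):
            for j in range(len(b)):
                pos[i][j] += v[i] * ph[j] / len(data)
    return pos


def exact_negative_phase(W, a, b):
    """Model term <v_i h_j>_model by enumerating all states (tiny RBMs only)."""
    n_v, n_h = len(a), len(b)
    neg = [[0.0] * n_h for _ in range(n_v)]
    Z = 0.0  # partition function
    for v in itertools.product([0, 1], repeat=n_v):
        for h in itertools.product([0, 1], repeat=n_h):
            energy = -(sum(a[i] * v[i] for i in range(n_v))
                       + sum(b[j] * h[j] for j in range(n_h))
                       + sum(v[i] * W[i][j] * h[j]
                             for i in range(n_v) for j in range(n_h)))
            weight = math.exp(-energy)
            Z += weight
            for i in range(n_v):
                for j in range(n_h):
                    neg[i][j] += weight * v[i] * h[j]
    return [[x / Z for x in row] for row in neg]


def cd1_negative_phase(data, W, a, b, rng):
    """CD-1 estimate of the model term: one Gibbs step from each data vector."""
    neg = [[0.0] * len(b) for _ in range(len(a))]
    for v0 in data:
        h0 = sample(hidden_probs(v0, W, b), rng)
        v1 = sample(visible_probs(h0, W, a), rng)
        ph1 = hidden_probs(v1, W, b)
        for i in range(len(a)):
            for j in range(len(b)):
                neg[i][j] += v1[i] * ph1[j] / len(data)
    return neg


def gradient(pos, neg):
    """Weight gradient: data term minus model term."""
    return [[p - n for p, n in zip(p_row, n_row)]
            for p_row, n_row in zip(pos, neg)]


# Toy setup (hypothetical values chosen only for illustration).
rng = random.Random(0)
n_v, n_h = 3, 2
W = [[rng.gauss(0.0, 0.5) for _ in range(n_h)] for _ in range(n_v)]
a = [0.0] * n_v
b = [0.0] * n_h
data = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]

pos = positive_phase(data, W, b)
exact_grad = gradient(pos, exact_negative_phase(W, a, b))
cd_grad = gradient(pos, cd1_negative_phase(data, W, a, b, rng))
print("exact gradient:", exact_grad)
print("CD-1 estimate: ", cd_grad)
```

The two printed matrices generally differ, because CD-1 replaces the model average with samples taken one Gibbs step away from the data. The enumeration cost grows as 2^(n_v + n_h), which is why practical training needs some approximation of the model term.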

Journal: Neural Networks
Publication Date: Nov 3, 2018
Citations: 26
License type: publisher-specific-oa

Similar Papers

Variational Autoencoders using D-Wave Quantum Annealing
10 Dec 2018

Class sparsity signature based Restricted Boltzmann Machine
Anush Sankaran ... Angshul Majumdar
Pattern Recognition | VOL. 61
12 May 2016

Analysis of Different Sparsity Methods in Constrained RBM for Sparse Representation in Cognitive Robotic Perception
Zongyong Cui ... Hongliang Ren
Journal of Intelligent & Robotic Systems | VOL. 80
12 Feb 2015

Quantum Computation For Electronic Structure Calculations
15 Dec 2020
