Abstract

Recently, authors have introduced the idea of training neural networks with discrete weights using a mix of classical simulated annealing and a replica ansatz known from the statistical physics literature. Among other points, they claim their method is able to find robust configurations. In this paper, we analyze this so-called "replicated simulated annealing" algorithm. In particular, we give criteria to guarantee its convergence, and study when it successfully samples from the desired set of configurations. We also perform experiments on synthetic and real datasets.
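
The algorithm itself is easy to state. Below is a minimal sketch, in Python/NumPy, of one common formulation of replicated simulated annealing: y replicas of binary weights perform Metropolis single-flip moves on a joint energy with an attractive pairwise coupling between replicas, while the inverse temperature beta and the coupling gamma are annealed. The function name, the geometric schedules, and all default parameters are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def replicated_simulated_annealing(
    energy,                 # energy(w) -> float, e.g. number of training errors
    n,                      # number of binary weights per replica
    y=3,                    # number of replicas (illustrative default)
    steps=5_000,            # number of annealing sweeps (illustrative default)
    beta0=0.1, beta_rate=1.001,     # assumed geometric schedule for inverse temperature
    gamma0=0.05, gamma_rate=1.001,  # assumed geometric schedule for the replica coupling
    rng=None,
):
    """Sketch of replicated simulated annealing on {-1, +1}^n.

    Each of the y replicas performs Metropolis single-flip moves on the
    joint energy
        H = sum_a energy(w[a]) - (gamma / y) * sum_{a<b} w[a] . w[b],
    with both beta and gamma increased over time.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    w = rng.choice([-1, 1], size=(y, n))
    E = np.array([energy(w[a]) for a in range(y)], dtype=float)
    beta, gamma = beta0, gamma0
    for _ in range(steps):
        for a in range(y):
            i = rng.integers(n)
            w_new = w[a].copy()
            w_new[i] = -w_new[i]
            dE_data = energy(w_new) - E[a]
            # Flipping w[a, i] changes the coupling term by
            # +2 * (gamma / y) * w[a, i] * sum_{b != a} w[b, i].
            s_others = w[:, i].sum() - w[a, i]
            dH = dE_data + (gamma / y) * 2.0 * w[a, i] * s_others
            if dH <= 0 or rng.random() < np.exp(-beta * dH):
                w[a] = w_new
                E[a] += dE_data
        beta *= beta_rate
        gamma *= gamma_rate
    # Return the majority vote across replicas (ties broken towards +1).
    return np.where(w.sum(axis=0) >= 0, 1, -1)
```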

Highlights

  • In the past few years, there has been a growing interest in finding methods to train neural networks with discrete weights

  • The outline is as follows: in Sect. 2 we mathematically formalize and describe the Replicated Simulated Annealing algorithm, and in Sect. 3 we study its convergence properties

  • When training a binary logistic regression classifier on MNIST using Replicated Simulated Annealing, we typically achieve 88% accuracy on the test set, which is on par with the performance obtained with continuous weights and gradient descent (a hedged code sketch of this setup follows the list)
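
To make the last highlight concrete, here is a hedged sketch of how a binary linear classifier can be wired into the replicated_simulated_annealing routine sketched above. It uses a misclassification-count energy rather than the logistic loss, and a small synthetic ±1 dataset as a stand-in for MNIST; the planted teacher vector and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic +/-1 stand-in data (hypothetical sizes); on real MNIST one
# would binarize the images and labels instead.
n, m = 64, 500
teacher = rng.choice([-1, 1], size=n)   # hypothetical planted weights
X = rng.choice([-1, 1], size=(m, n))
t = np.sign(X @ teacher + 0.5)          # +0.5 avoids sign(0) ties

def training_errors(w):
    """Energy = number of training samples misclassified by sign(X @ w)."""
    return float(np.sum(np.sign(X @ w) != t))

w_hat = replicated_simulated_annealing(training_errors, n, y=3, steps=2_000)
print("training errors:", training_errors(w_hat))
```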


Summary

Introduction

In the past few years, there has been a growing interest in finding methods to train neural networks with discrete weights. As a matter of fact, such networks are particularly attractive when it comes to implementations, since discrete weights reduce memory and arithmetic costs.

Replicated Simulated Annealing
Convergence of the Annealing Process
Experiments
MNIST Dataset
Influence of
Robustness of Trained Models
Conclusion
Findings
