Abstract

Transfer learning refers to taking knowledge gained while solving one machine learning task and applying it to a closely related problem. Such an approach has enabled scientific breakthroughs in computer vision and natural language processing, where the weights learned by state-of-the-art models can be used to initialize models for other tasks, dramatically improving their performance and saving computational time. Here we demonstrate an unsupervised learning approach augmented with basic physical principles that achieves fully transferable learning for problems in statistical physics across different physical regimes. By coupling a sequence model based on a recurrent neural network to an extensive deep neural network, we are able to learn the equilibrium probability distributions and inter-particle interaction models of classical statistical mechanical systems. Our approach, distribution-consistent learning (DCL), is a general strategy that works for a variety of canonical statistical mechanical models (Ising and Potts) as well as disordered interaction potentials. Using data collected under a single set of observation conditions, DCL successfully extrapolates across all temperatures and thermodynamic phases and can be applied at different length scales. This constitutes fully transferable, physics-based learning in a generalizable approach.

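To make the sequence-model component concrete, the sketch below shows one way an autoregressive recurrent network can assign a normalized probability to a spin configuration by factorizing it over lattice sites as a product of per-site conditionals p(s_i | s_1, ..., s_{i-1}). This is an illustrative sketch rather than the authors' implementation; the GRU cell, the hidden size, and the binary +1/-1 spin encoding are assumptions made for readability.

```python
# Minimal sketch (not the paper's code): an autoregressive RNN that assigns a
# normalized log-probability to a flattened spin configuration.
import torch
import torch.nn as nn

class SpinRNN(nn.Module):
    def __init__(self, hidden_size=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)  # logit for p(s_i = +1 | s_<i)

    def log_prob(self, spins):
        # spins: (batch, N) tensor of +/-1 values, one row per flattened lattice
        batch, n = spins.shape
        # shift right so site i is conditioned only on the sites before it
        prev = torch.cat([torch.zeros(batch, 1), spins[:, :-1]], dim=1)
        hidden, _ = self.rnn(prev.unsqueeze(-1))
        logits = self.head(hidden).squeeze(-1)      # (batch, N)
        targets = (spins + 1) / 2                   # map {-1, +1} -> {0, 1}
        logp_per_site = -nn.functional.binary_cross_entropy_with_logits(
            logits, targets, reduction="none")
        return logp_per_site.sum(dim=1)             # log P(s) = sum_i log p(s_i | s_<i)

model = SpinRNN()
configs = torch.randint(0, 2, (4, 16 * 16)).float() * 2 - 1  # four 16x16 configurations
print(model.log_prob(configs).shape)                          # torch.Size([4])
```
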
Highlights

  • Machine learning has emerged as a powerful tool in the physical sciences, seeing both rapid adoption and experimentation in recent years

  • Within the field of statistical mechanics, machine learning has recently been used to estimate the value of the partition function [5] and to solve canonical models [6], and generative models conditioned on the Boltzmann distribution have been shown to sample statistical mechanical ensembles efficiently [7]

  • Extensive Deep Neural Network (EDNN) topologies require that physical laws are the same everywhere. They are designed to learn a function which, when applied across a configuration, maps the sum of its outputs to an extensive quantity such as the internal energy (see the sketch after this list). In this case, we find that this physics-based network design requirement improves performance in predicting the underlying interactions even when the training labels carry noise introduced by the imperfect recurrent neural network (RNN) energy model

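The minimal sketch below illustrates the extensive-network idea from the last highlight. It is not the paper's EDNN: the layer sizes, the small convolutional stack, and the circular (periodic) padding are assumptions chosen to show how summing a translationally invariant local function over the lattice yields an output that scales with system size.

```python
# Hedged sketch of an extensive network: each lattice site contributes a local
# term, and the predicted energy is the sum of those terms.
import torch
import torch.nn as nn

class ExtensiveNet(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1, padding_mode="circular"),
            nn.ReLU(),
            nn.Conv2d(channels, 1, kernel_size=1),  # one energy density per site
        )

    def forward(self, spins):
        # spins: (batch, 1, L, L); returns one extensive energy per configuration
        per_site = self.local(spins)
        return per_site.sum(dim=(1, 2, 3))

net = ExtensiveNet()
small = torch.randint(0, 2, (2, 1, 8, 8)).float() * 2 - 1
large = torch.randint(0, 2, (2, 1, 32, 32)).float() * 2 - 1
print(net(small).shape, net(large).shape)  # the same local function works at both sizes
```

Because the same local function is applied at every site, a network trained at one lattice size can be evaluated at another, which is the kind of transfer across length scales described in the abstract.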

Summary

INTRODUCTION

Machine learning has emerged as a powerful tool in the physical sciences, seeing both rapid adoption and experimentation in recent years. One limitation is that, for spin flips at some locations in the lattice, one may need to recompute only some of the conditionals, but in general an extensive number of them must be recomputed. Another limitation is that the RNN energy model is only able to make predictions for system sizes which are the same as those within the training set. EDNN are designed to learn a function which, when applied across a configuration, maps the sum of its outputs to an extensive quantity such as the internal energy. In this case, we find that this physics-based network design requirement improves performance in predicting the underlying interactions even when the training labels carry noise introduced by the imperfect RNN energy model. EDNN enforce the postulate of uniformity of physical law in our training procedure; our performance improves as a result.
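For context, the relations below restate the standard autoregressive factorization assumed by an RNN energy model and the textbook Boltzmann identification of energies with log-probabilities; they are generic identities written here for clarity, not expressions quoted from the paper. Because flipping a single spin s_k changes the conditioning context of every factor with index greater than k, an extensive number of conditionals must in general be re-evaluated, which is the limitation noted above.

```latex
P(s_1, \dots, s_N) = \prod_{i=1}^{N} p\left(s_i \mid s_1, \dots, s_{i-1}\right),
\qquad
E(\mathbf{s}) = -k_B T \left[ \ln P(\mathbf{s}) + \ln Z \right]
```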
