Abstract
Adaptive systems—such as a biological organism gaining survival advantage, an autonomous robot executing a functional task, or a motor protein transporting intracellular nutrients—must somehow embody relevant regularities and stochasticity in their environments to take full advantage of thermodynamic resources. Analogously, but in a purely computational realm, machine learning algorithms estimate models to capture predictable structure and identify irrelevant noise in training data. This happens through optimization of performance metrics, such as model likelihood. If such learning is physically implemented, is there a sense in which computational models estimated through machine learning are physically preferred? We introduce the thermodynamic principle that work production is the most relevant performance measure for an adaptive physical agent and compare the results to the maximum-likelihood principle that guides machine learning. Within the class of physical agents that most efficiently harvest energy from their environment, we demonstrate that an efficient agent’s model explicitly determines its architecture and how much useful work it harvests from the environment. We then show that selecting the maximum-work agent for given environmental data corresponds to finding the maximum-likelihood model. This establishes an equivalence between nonequilibrium thermodynamics and dynamic learning. In this way, work maximization emerges as an organizing principle that underlies learning in adaptive thermodynamic systems.
Highlights
A debate has carried on for the last century and a half over the relationship between abiotic physical processes and intelligence
The following introduces what we need for this: concepts from machine learning, computational mechanics, and thermodynamic computing
Substituting zero entropy production into Eq. (2), we arrive at our result: work production for thermodynamically efficient computations, W^eff, is the change in pointwise nonequilibrium free energy
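The highlight above compresses a derivation. A hedged sketch of the step it describes, in a common sign convention: Eq. (2) itself is not reproduced in this summary, so the exact form of the decomposition below, and the symbols ℱ^neq (pointwise nonequilibrium free energy) and Σ (entropy production), are assumptions rather than the paper's notation.

```latex
% Assumed form: extracted work splits into a free-energy change and a
% nonnegative dissipation term (entropy production \Sigma).
\begin{align}
  \langle W \rangle &= -\Delta \mathcal{F}^{\mathrm{neq}} - k_B T\,\Sigma,
    \qquad \Sigma \ge 0. \\
  \intertext{Setting $\Sigma = 0$ (a thermodynamically efficient computation) then gives}
  W^{\mathrm{eff}} &= -\Delta \mathcal{F}^{\mathrm{neq}}.
\end{align}
```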
Summary
A debate has carried on for the last century and a half over the relationship (if any) between abiotic physical processes and intelligence. Most prosaically, translating training data into a model corresponds to density estimation [5], where the algorithm uses the data to construct a probability distribution. This type of model-building at first appears far afield from more familiar machine learning tasks, such as categorizing pet pictures into cats and dogs or generating a novel image of a giraffe from a photo travelogue. To carry out density estimation, machine learning invokes the principle of maximum likelihood to guide intelligent learning. This says that, of the possible models consistent with the training data, an algorithm should select the one with the maximum probability of having generated the data. While it is natural to argue that learning confers benefits, our result establishes that the benefit is fundamentally rooted in the physics of energy and information. Once these central results are presented and their interpretation explained, but before we conclude, we briefly recount the long-lived narrative of the thermodynamics of organization. We then explore the use of work as a measure of learning performance, deriving the equivalence between the conditions of maximum work and maximum likelihood.
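The maximum-likelihood selection described above can be sketched in a few lines. This is a minimal illustration, not the paper's setup: the binary "environmental" data and the family of Bernoulli candidate models are assumptions chosen for brevity.

```python
# Maximum-likelihood density estimation, minimal sketch.
# Candidate models: Bernoulli distributions with parameter p = P(x = 1).
import math

data = [1, 0, 1, 1, 0, 1, 1, 1]  # toy training sequence (assumption)

def log_likelihood(p, xs):
    # log P(xs | p) for i.i.d. Bernoulli draws
    return sum(math.log(p if x == 1 else 1.0 - p) for x in xs)

# Of the candidate models, select the one with the maximum probability
# of having generated the data.
candidates = [i / 100 for i in range(1, 100)]
p_star = max(candidates, key=lambda p: log_likelihood(p, data))
print(p_star)  # the empirical frequency sum(data)/len(data) = 0.75
```

For this family the maximum-likelihood model is simply the empirical frequency of ones, which the grid search recovers.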