We develop a novel data-driven approach to the inverse problem of classical statistical mechanics: given experimental data on the collective motion of a classical many-body system, how does one characterize the free-energy landscape of that system? By combining non-parametric Bayesian inference with physically motivated constraints, we develop an efficient learning algorithm that automates the construction of approximate free-energy functionals. In contrast to optimization-based machine learning approaches, which seek to minimize a cost function, the central idea of the proposed Bayesian inference is to propagate a set of prior assumptions, derived from physical principles, through the model. The experimental data are then used to probabilistically weigh the possible model predictions. This naturally leads to human-interpretable algorithms with full uncertainty quantification of predictions. In our case, the output of the learning algorithm is a probability distribution over a family of free-energy functionals, consistent with the observed particle data. We find that surprisingly small data samples contain sufficient information for inferring highly accurate analytic expressions of the underlying free-energy functionals, making our algorithm highly data efficient. In particular, we consider classical particle systems with excluded-volume interactions, which are ubiquitous in nature while being highly challenging in terms of free-energy modeling. We validate our approach on the paradigmatic case of a one-dimensional fluid and develop inference algorithms for the canonical and grand-canonical statistical-mechanical ensembles. Extensions to higher-dimensional systems are conceptually straightforward, while standard coarse-graining techniques allow one to easily incorporate attractive interactions.