Abstract

This paper describes a variational free-energy formulation of (partially observable) Markov decision problems in decision making under uncertainty. We show that optimal control can be cast as active inference. In active inference, both action and posterior beliefs about hidden states minimise a free energy bound on the negative log-likelihood of observed states, under a generative model. In this setting, reward or cost functions are absorbed into prior beliefs about state transitions and terminal states. Effectively, this converts optimal control into a pure inference problem, enabling the application of standard Bayesian filtering techniques. We then consider optimal trajectories that rest on posterior beliefs about hidden states in the future. Crucially, this entails modelling control as a hidden state that endows the generative model with a representation of agency. This leads to a distinction between models without and with inference on hidden control states; namely, agency-free and agency-based models, respectively.
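
In generic variational notation (illustrative rather than the paper's own), the bound in question is the variational free energy, an upper bound on the negative log-likelihood of observations, with cost entering only through the priors of the generative model:

    F = E_q(s)[ ln q(s) - ln p(o, s) ]
      = -ln p(o) + KL[ q(s) || p(s | o) ]
      >= -ln p(o)

    p(s_T) ∝ exp( -c(s_T) )    (a cost function c absorbed into the prior over terminal states s_T)

Minimising F with respect to the posterior q(s) implements inference on hidden states; minimising it with respect to action makes observations conform to the agent's prior beliefs.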

Highlights

  • In this work, we apply variational free-energy minimisation to a well-studied problem in optimal decision theory, psychology and machine learning; namely, Markov decision processes

  • The purpose of the second section is to resolve the apparent conflict between prescribing behaviour with cost functions and prescribing it with prior beliefs: in brief, we will see that the cost functions that are used to guide action in optimal control theory can be absorbed into prior beliefs in active inference

  • We describe a scheme for partially observable Markov decision processes that optimises action in relation to prior beliefs about future states


Summary

Introduction

We apply variational free-energy minimisation to a well-studied problem in optimal decision theory, psychology and machine learning; namely, Markov decision processes. The purpose of the second section is to resolve the apparent conflict between these two formulations (behaviour prescribed by cost functions versus behaviour prescribed by prior beliefs): in brief, we will see that the cost functions that are used to guide action in optimal control theory can be absorbed into prior beliefs in active inference. This means that agents expect their state transitions to minimise cost, while action realises these prior beliefs by maximising the marginal likelihood of observations. The third section illustrates this by showing how optimal policies can be inferred under prior beliefs about future (terminal) states, using standard variational Bayesian procedures (Beal 2003). This example leads to a model-based optimisation of behaviour that may provide a useful metaphor for planning, anticipation and a sense of agency in real-world agents. We conclude with an example (the mountain car problem) that illustrates how active inference furnishes online non-linear optimal control, with partially observed (hidden) states that are subject to random fluctuations.
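
As a deliberately minimal illustration of inferring policies from prior beliefs (a sketch under assumed details, not the paper's scheme or code), the following Python fragment selects a short action sequence in a small, hypothetical Markov decision process by asking how plausible the terminal state it leads to is under a prior that absorbs the cost function; the states, costs and surrogate objective are invented for illustration.

# Minimal sketch (illustrative assumptions, not the paper's code): control as inference
# in a tiny, hypothetical MDP. Cost is absorbed into a prior over terminal states, and a
# policy is chosen by the expected log prior probability of the terminal state it induces,
# rather than by evaluating a value function or cost-to-go.
import itertools
import numpy as np

n_states, n_actions, horizon = 4, 2, 3

rng = np.random.default_rng(0)
# B[a][s_next, s]: probability of moving to s_next from s under action a (random, for illustration)
B = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states)).transpose(0, 2, 1)

cost = np.array([3.0, 2.0, 1.0, 0.0])      # hypothetical state costs
prior_terminal = np.exp(-cost)             # prior belief about terminal states: p(s_T) proportional to exp(-c(s_T))
prior_terminal /= prior_terminal.sum()

s0 = np.zeros(n_states)
s0[0] = 1.0                                # the agent starts in state 0 with certainty

def expected_log_prior(policy):
    """Expected log prior probability of the terminal state under a candidate policy."""
    belief = s0
    for a in policy:                       # propagate beliefs through the transition model
        belief = B[a] @ belief
    return float(belief @ np.log(prior_terminal))

# Enumerate all short policies and keep the one whose predicted terminal state is most
# plausible under the prior beliefs, i.e. least costly in expectation.
policies = list(itertools.product(range(n_actions), repeat=horizon))
best = max(policies, key=expected_log_prior)
print("selected policy:", best)

The point of the sketch is only that action selection reduces to evaluating beliefs under a generative model; the agency-based scheme described in the paper goes further and treats control itself as a hidden state to be inferred.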

Markov decision processes
Partially observable Markov decision processes
Optimal control as inference
Active inference
Perception and action
Optimality and complete class theorems
Bayes-optimal control without cost functions
Agency-based optimisation
Simulations: the mountain car problem
Simulation setup
Discussion
Learning versus inference
Suboptimal control and psychopathology