Abstract

Biological agents are remarkably proficient at calibrating the scale of planning and evaluation appropriate to their environment. Any decision-making algorithm that aspires to neurobiological plausibility must therefore mirror these attributes, particularly with regard to computational expenditure and the complexity of evaluation. However, active inference faces notable challenges in simulating apt behaviour in complex environments, stemming chiefly from its substantial computational demands and from the difficulty of specifying the agent's behavioural preferences. We address these challenges with a two-fold approach. First, we introduce a planning algorithm that applies the Bellman-optimality principle to minimise the planning cost function (i.e., expected free energy): briefly, we recursively compute the expected free energy of actions in reverse temporal order, which significantly reduces computational complexity. Second, inspired by the Z-learning algorithm, we propose a novel method for learning time-constrained agent preferences. We face-validate these methods in grid-world simulations, demonstrating accurate model learning and planning even under uncertainty. These algorithmic advances open new opportunities for applications in neuroscience and machine learning.
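The reverse-temporal recursion described above can be illustrated with a minimal dynamic-programming sketch. This is not the paper's implementation: the state/action sizes, the random transition model `P`, and the per-step cost `g` are all hypothetical placeholders, and the per-step expected free energy is reduced to a single scalar per state-action pair rather than decomposed into risk and ambiguity terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem (not from the paper): 4 states, 2 actions, horizon 3.
n_states, n_actions, horizon = 4, 2, 3

# P[a, s, s']: transition probability of reaching s' after action a in state s.
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)

# g[s, a]: per-step expected free energy of action a in state s
# (a real agent would compute this from risk and ambiguity; here it is random).
g = rng.random((n_states, n_actions))

# Bellman-style recursion in reverse temporal order:
#   G_t(s) = min_a [ g(s, a) + E_{s'|s,a}[ G_{t+1}(s') ] ]
G = np.zeros(n_states)  # cost-to-go beyond the horizon
policy = []
for t in reversed(range(horizon)):
    Q = g + np.einsum('asq,q->sa', P, G)  # Q[s, a]: action-conditional EFE at t
    policy.append(Q.argmin(axis=1))       # Bellman-optimal action per state
    G = Q.min(axis=1)
policy.reverse()                          # policy[t][s]: best action at time t
```

Because each sweep reuses the cost-to-go `G` from the following time step, the work grows linearly with the horizon instead of exponentially with the number of action sequences, which is the source of the computational saving the abstract refers to.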
