Abstract

A fundamental problem faced by animals is learning to select actions based on noisy sensory information and incomplete knowledge of the world. It has been suggested that the brain engages in Bayesian inference during perception but how such probabilistic representations are used to select actions has remained unclear. Here we propose a neural model of action selection and decision making based on the theory of partially observable Markov decision processes (POMDPs). Actions are selected based not on a single “optimal” estimate of state but on the posterior distribution over states (the “belief” state). We show how such a model provides a unified framework for explaining experimental results in decision making that involve both information gathering and overt actions. The model utilizes temporal difference (TD) learning for maximizing expected reward. The resulting neural architecture posits an active role for the neocortex in belief computation while ascribing a role to the basal ganglia in belief representation, value computation, and action selection. When applied to the random dots motion discrimination task, model neurons representing belief exhibit responses similar to those of LIP neurons in primate neocortex. The appropriate threshold for switching from information gathering to overt actions emerges naturally during reward maximization. Additionally, the time course of reward prediction error in the model shares similarities with dopaminergic responses in the basal ganglia during the random dots task. For tasks with a deadline, the model learns a decision making strategy that changes with elapsed time, predicting a collapsing decision threshold consistent with some experimental studies. The model provides a new framework for understanding neural decision making and suggests an important role for interactions between the neocortex and the basal ganglia in learning the mapping between probabilistic sensory representations and actions that maximize rewards.
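To make the belief computation concrete, the following sketch shows a Bayesian belief update for a two-alternative motion discrimination task of the kind the abstract describes: the posterior over the two motion directions (the "belief" state) is updated after each noisy observation. The Gaussian likelihood model and the evidence-strength and noise parameters (MU, SIGMA) are illustrative assumptions, not the paper's fitted values.

```python
import numpy as np

# Assumed setup: two hidden states (rightward vs. leftward motion) generating
# noisy scalar evidence from Gaussians with opposite means. Parameters are
# hypothetical, chosen only for illustration.
MU, SIGMA = 0.5, 1.0  # evidence strength and noise level (assumptions)

def update_belief(belief, observation):
    """One Bayesian update of the posterior over the two motion directions."""
    # Likelihood of the observation under each hidden state
    lik_right = np.exp(-0.5 * ((observation - MU) / SIGMA) ** 2)
    lik_left = np.exp(-0.5 * ((observation + MU) / SIGMA) ** 2)
    posterior = belief * np.array([lik_right, lik_left])
    return posterior / posterior.sum()

# Example: accumulate evidence from a stream of noisy observations
rng = np.random.default_rng(0)
belief = np.array([0.5, 0.5])  # uniform prior over direction
for _ in range(20):
    obs = rng.normal(MU, SIGMA)  # true direction: rightward
    belief = update_belief(belief, obs)
print(belief)  # the belief concentrates on the rightward state
```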

Highlights

  • To survive in a constantly changing and uncertain environment, animals must solve the problem of learning to choose actions based on noisy sensory information and incomplete knowledge of the world

  • We propose a neural model for action selection and decision making that combines probabilistic representations of the environment with a reinforcement-based learning mechanism to select actions that maximize total expected future reward

  • The model leverages recent advances in three different fields: (1) neural models of Bayesian inference, (2) the theory of optimal decision making under uncertainty based on partially observable Markov decision processes (POMDPs), and (3) algorithms for temporal difference (TD) learning in reinforcement learning theory


Introduction

To survive in a constantly changing and uncertain environment, animals must solve the problem of learning to choose actions based on noisy sensory information and incomplete knowledge of the world. A number of computational models have been proposed to demonstrate how Bayesian inference could be performed in biologically plausible networks of neurons (Rao, 2004, 2005; Yu and Dayan, 2005; Zemel et al., 2005; Ma et al., 2006; Beck et al., 2008; Deneve, 2008). A question that has received less attention is how such probabilistic representations could be utilized to learn actions that maximize expected reward. We propose a neural model for action selection and decision making that combines probabilistic representations of the environment with a reinforcement-based learning mechanism to select actions that maximize total expected future reward. The model leverages recent advances in three different fields: (1) neural models of Bayesian inference, (2) the theory of optimal decision making under uncertainty based on partially observable Markov decision processes (POMDPs), and (3) algorithms for temporal difference (TD) learning in reinforcement learning theory.
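As a minimal illustration of how these three ingredients fit together, the sketch below combines Bayesian belief updates (ingredient 1), a POMDP with one information-gathering action and two overt choices (ingredient 2), and tabular TD learning over a discretized belief state (ingredient 3). All parameters (evidence strength, sampling cost, learning rate, belief discretization) are illustrative assumptions, and the lookup table stands in for the paper's neural representation of values over beliefs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical task setup (not the paper's exact parameters): hidden direction
# in {0: right, 1: left}; actions: 0 = sample another observation (small cost),
# 1 = choose right, 2 = choose left (reward +1 if correct, 0 otherwise).
MU, SIGMA = 0.3, 1.0
SAMPLE_COST, GAMMA, ALPHA, EPS = -0.01, 1.0, 0.1, 0.1
N_BINS = 21  # discretize the belief P(right) for a tabular value function

Q = np.zeros((N_BINS, 3))  # action values over discretized belief states

def belief_bin(p_right):
    return min(int(p_right * N_BINS), N_BINS - 1)

for episode in range(30_000):
    direction = rng.integers(2)            # hidden state, fixed for the trial
    mean = MU if direction == 0 else -MU
    p_right = 0.5                          # uniform prior
    done = False
    while not done:
        s = belief_bin(p_right)
        a = rng.integers(3) if rng.random() < EPS else int(np.argmax(Q[s]))
        if a == 0:   # gather more evidence: Bayesian belief update
            obs = rng.normal(mean, SIGMA)
            lr = np.exp(-0.5 * ((obs - MU) / SIGMA) ** 2)
            ll = np.exp(-0.5 * ((obs + MU) / SIGMA) ** 2)
            p_right = p_right * lr / (p_right * lr + (1 - p_right) * ll)
            s2 = belief_bin(p_right)
            td = SAMPLE_COST + GAMMA * Q[s2].max() - Q[s, a]  # TD error
        else:        # overt choice ends the trial
            correct = (a == 1 and direction == 0) or (a == 2 and direction == 1)
            td = (1.0 if correct else 0.0) - Q[s, a]          # terminal TD error
            done = True
        Q[s, a] += ALPHA * td

# Inspect the learned policy as a function of belief
for s in range(N_BINS):
    print(f"P(right) ~ {(s + 0.5) / N_BINS:.2f}  best action: {np.argmax(Q[s])}")
```

In this sketch the learned values typically favor further sampling at intermediate beliefs and an overt choice once the belief becomes sufficiently extreme, mirroring how the threshold for switching from information gathering to overt action emerges from reward maximization rather than being set by hand.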
