Abstract

Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than concluding that humans are suboptimal, we argue that the learning problem they face is more complex: it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate that humans can perform structure learning in a near-optimal manner.
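To make the abstract's structure-learning setup concrete, here is a minimal sketch of Bayesian inference over two candidate reward structures for a two-arm task, assuming Bernoulli rewards and uniform Beta(1, 1) priors. The two structures (independent rates versus a single coupled rate), the priors, and all function names are illustrative assumptions, not the exact generative models used in the experiments.

```python
# Minimal sketch: posterior over two hypothetical reward structures,
# assuming Bernoulli rewards and Beta(1, 1) priors (both assumptions).
# S_indep:   each arm has its own unknown reward rate (two-armed bandit).
# S_coupled: one shared rate theta; arm 1 pays with prob. theta and
#            arm 2 with prob. 1 - theta (a stand-in for a structured,
#            one-armed-like generative model).
from math import lgamma, exp

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_indep(s1, f1, s2, f2):
    """log P(data | S_indep): independent Beta-Bernoulli marginal per arm."""
    return (log_beta(1 + s1, 1 + f1) + log_beta(1 + s2, 1 + f2)
            - 2 * log_beta(1, 1))

def log_marginal_coupled(s1, f1, s2, f2):
    """log P(data | S_coupled): arm-2 successes count as evidence for 1 - theta."""
    return log_beta(1 + s1 + f2, 1 + f1 + s2) - log_beta(1, 1)

def structure_posterior(s1, f1, s2, f2, prior_indep=0.5):
    """Posterior probability of S_indep given success/failure counts."""
    li = log_marginal_indep(s1, f1, s2, f2)
    lc = log_marginal_coupled(s1, f1, s2, f2)
    m = max(li, lc)  # normalize in log space for numerical stability
    wi = prior_indep * exp(li - m)
    wc = (1 - prior_indep) * exp(lc - m)
    return wi / (wi + wc)

# Example: arm 1 paid on 8/10 pulls, arm 2 on 2/10 -- data consistent
# with coupling, so the independent structure loses posterior mass.
print(structure_posterior(s1=8, f1=2, s2=2, f2=8))
```

Because the Beta prior is conjugate to the Bernoulli likelihood, both marginal likelihoods have closed forms, so a learner could in principle update this structure posterior exactly after every pull.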

Highlights

  • From a squirrel deciding where to bury its nuts to a scientist selecting the next experiment, all decision-making organisms must balance exploration of alternatives against exploitation of known options in developing action plans

  • In an experimental test of structure learning, we show that humans learn reward structure from experience in a near-optimal manner

  • We argue that structure learning plays a major role in human sequential decision-making


Introduction

From a squirrel deciding where to bury its nuts to a scientist selecting the next experiment, all decision-making organisms must balance exploration of alternatives against exploitation of known options in developing action plans. Determining when exploration is profitable is itself a decision problem that requires understanding or learning about the statistical structure of the environment. Consider a simple bandit-style task: the aim is to maximize the total reward obtained from the environment, but the difficulty is that the rate of reward for each option is unknown and must be learned. In this simple setting, there may be several hypotheses about how the reward generation process works: how actions, observations, and unknowns are structurally "connected." We propose three kinds of structures that capture several versions of sequential decision-making tasks available in the literature. The first structure has temporal dependency between the present probability of reward and the past probability of reward, investigated in the context of
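As a lightweight illustration of the exploration-exploitation trade-off under unknown reward rates, the sketch below runs Thompson sampling on a two-armed Bernoulli bandit: the agent samples a plausible rate for each arm from its Beta posterior and plays the arm with the highest sample. Thompson sampling is a standard heuristic, not the Bayes-optimal policy analyzed in this paper, and the reward rates and trial count are arbitrary assumptions.

```python
# Minimal sketch: Thompson sampling on a two-armed Bernoulli bandit.
# Illustrative only; not the Bayes-optimal learning policy from the paper.
import random

def thompson_run(true_rates, n_trials=1000, seed=0):
    rng = random.Random(seed)
    # Beta(1, 1) prior per arm, stored as (successes + 1, failures + 1).
    alpha = [1] * len(true_rates)
    beta = [1] * len(true_rates)
    total_reward = 0
    for _ in range(n_trials):
        # Sample a plausible rate per arm from its posterior; exploration
        # falls out of posterior uncertainty rather than an explicit bonus.
        samples = [rng.betavariate(alpha[a], beta[a])
                   for a in range(len(true_rates))]
        arm = max(range(len(true_rates)), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_rates[arm] else 0
        total_reward += reward
        # Conjugate Beta-Bernoulli update of the chosen arm only.
        alpha[arm] += reward
        beta[arm] += 1 - reward
    return total_reward

# Example run with assumed rates of 0.3 and 0.6.
print(thompson_run(true_rates=[0.3, 0.6]))
```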


