Abstract

Converging evidence has demonstrated that humans exhibit two distinct strategies when learning in complex environments. One is model-free learning, i.e., simple reinforcement of rewarded actions, and the other is model-based learning, which considers the structure of the environment. Recent work has argued that people exhibit little model-based behavior unless it leads to higher rewards. Here we use mouse tracking to study model-based learning in stochastic and deterministic (pattern-based) environments of varying difficulty. In both tasks participants’ mouse movements reveal that they learned the structures of their environments, despite the fact that standard behavior-based estimates suggested no such learning in the stochastic task. Thus, we argue that mouse tracking can reveal whether subjects have structure knowledge, which is necessary but not sufficient for model-based choice.

Highlights

  • Converging evidence has demonstrated that humans exhibit two distinct strategies when learning in complex environments

  • We find that mouse tracking can reveal individuals’ subjective beliefs and we demonstrate that even though individuals learn the task structure, their choices do not necessarily become model-based

  • For the model-free case, the first-stage choice is more likely to be repeated if the previous trial yielded a high reward, regardless of the transition type (Fig. 2a)
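
The model-free signature in the last highlight can be checked with a stay-probability analysis. The sketch below is illustrative only and is not the authors' analysis code: the trial fields `choice`, `common`, and `high_reward` are hypothetical names, and treating reward as a binary high/low flag is an assumption. It groups each trial by the previous trial's reward and transition type and reports how often the first-stage choice was repeated.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Return P(repeat first-stage choice) keyed by the previous trial's
    (high_reward, common) pair.

    `trials` is a list of dicts with hypothetical keys:
      'choice'      - first-stage choice on that trial
      'common'      - True if the transition was the common type
      'high_reward' - True if the trial yielded a high reward (assumed binary)
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [stays, total]
    for prev, cur in zip(trials, trials[1:]):
        key = (prev["high_reward"], prev["common"])
        counts[key][1] += 1
        if cur["choice"] == prev["choice"]:
            counts[key][0] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}
```

Under a purely model-free pattern, the values for keys with `high_reward` set to `True` would exceed those with `False` by about the same amount for common and rare transitions.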

Summary

Introduction

Converging evidence has demonstrated that humans exhibit two distinct strategies when learning in complex environments. While the initial studies of model-based behavior involved a popular two-stage Markov decision task with stochastic action-state contingencies[1,17,23,26,27], recent evidence suggests that model-based strategies do not typically lead to higher rewards in these probabilistic tasks. It is therefore unclear whether a lack of model-based behavior reflects an inability to learn the structure of the environment or indifference towards the model-based strategy. We reasoned that it should be possible to use mouse trajectories to infer how strongly decision-makers expect particular outcomes.
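
As a rough illustration of how a mouse trajectory might be turned into a number, the sketch below computes the maximum perpendicular deviation of a path from the straight line joining its start and end points, a measure commonly used in mouse-tracking work. This excerpt does not say which trajectory measure the authors used, so the choice of measure and the function name `max_deviation` are assumptions.

```python
import math


def max_deviation(points):
    """Largest perpendicular distance between a trajectory and the straight
    line from its first to its last point.

    `points` is a non-empty sequence of (x, y) cursor samples. This measure
    is an assumed stand-in, not necessarily the one used in the paper.
    """
    (x0, y0), (x1, y1) = points[0], points[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        return 0.0  # start and end coincide; no reference line to deviate from
    return max(
        abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / length
        for x, y in points
    )
```

A direct movement toward the eventually chosen option gives a value near zero, while a path that bends toward another option first gives a larger value.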
