Abstract

Research has not yet reached a consensus on why humans match probabilities instead of maximising in a probability learning task. The most influential explanation is that they search for patterns in the random sequence of outcomes. Other explanations, such as expectation matching, are plausible, but do not consider how reinforcement learning shapes people’s choices. We aimed to quantify how human performance in a probability learning task is affected by pattern search and reinforcement learning. We collected behavioural data from 84 young adult participants who performed a probability learning task wherein the majority outcome was rewarded with 0.7 probability, and analysed the data using a reinforcement learning model that searches for patterns. Model simulations indicated that pattern search, exploration, recency (discounting early experiences), and forgetting may impair performance. Our analysis estimated that 85% (95% HDI [76, 94]) of participants searched for patterns and believed that each trial outcome depended on one or two previous ones. The estimated impact of pattern search on performance was, however, only 6%, while those of exploration and recency were 19% and 13%, respectively. This suggests that probability matching is caused by uncertainty about how outcomes are generated, which leads to pattern search, exploration, and recency.
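To make the cost of probability matching concrete, the following is a minimal sketch rather than the authors' analysis. It assumes that trials are independent, that the majority outcome occurs with probability 0.7, and that choices are made independently of the upcoming outcome. The function names `expected_accuracy` and `simulate` are hypothetical and introduced only for illustration.

```python
import random


def expected_accuracy(p_choose_majority: float, p_majority: float = 0.7) -> float:
    """Probability of a correct choice when the majority option is chosen
    with probability p_choose_majority, independently of the outcome
    (an assumption of this sketch)."""
    return (p_choose_majority * p_majority
            + (1 - p_choose_majority) * (1 - p_majority))


def simulate(p_choose_majority: float, p_majority: float = 0.7,
             n_trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo check of expected_accuracy under the same assumptions."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        outcome_is_majority = rng.random() < p_majority
        choice_is_majority = rng.random() < p_choose_majority
        correct += outcome_is_majority == choice_is_majority
    return correct / n_trials


# Matching: choose the majority option on 70% of trials.
matching = expected_accuracy(0.7)    # 0.7 * 0.7 + 0.3 * 0.3, about 0.58
# Maximising: always choose the majority option.
maximising = expected_accuracy(1.0)  # 0.7

if __name__ == "__main__":
    print(f"matching:   {matching:.2f}")
    print(f"maximising: {maximising:.2f}")
    print(f"simulated matching: {simulate(0.7):.3f}")
```

Under these assumptions, a strict matcher is correct on about 58% of trials and a maximiser on 70%, which is the performance gap the competing explanations try to account for.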

Highlights

  • Only Plonsky et al.[13] have attempted to estimate working memory usage in a reinforcement learning task, but when they used models in which pattern search was the main cause of suboptimal choices, they predicted large k values that lie beyond working memory capacity and generate extremely hard learning problems

  • We modelled our data with a reinforcement learning model that searches for patterns, the Markov pattern learning (MPL) model

  • Our work has made novel quantitative and conceptual contributions to the study of human decision making. It confirmed that in a probability learning task the vast majority of participants search for patterns in the outcome sequence, and made the novel estimation that participants believe that each outcome depends on one or two previous ones


Summary

Objectives

There are many plausible mechanisms for probability matching, and it is possible that human performance is affected by more than one. Our primary aim was to quantify the effects of pattern search, forgetting, exploration, and recency on human performance in a probability learning task. Our secondary aim was to estimate k, a measure of working memory usage in pattern search, which determines how complex the patterns people search for are. This is important because, as discussed above, searching for complex patterns impairs performance by creating a tendency to make decisions based on few past observations[13] and by interacting with forgetting. Only Plonsky et al.[13] have attempted to estimate working memory usage in a reinforcement learning task, but when they used models in which pattern search was the main cause of suboptimal choices, they predicted large k values that lie beyond working memory capacity and generate extremely hard learning problems.
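The link between k and decisions based on few past observations can be illustrated with a short sketch. This is not the MPL model itself; it assumes binary outcomes and a learner that keeps separate statistics for every possible sequence of the k previous outcomes. The function names and the trial count of 300 are hypothetical, chosen only for illustration.

```python
def n_histories(k: int, n_outcomes: int = 2) -> int:
    """Number of distinct sequences of the k previous outcomes
    (binary outcomes are an assumption of this sketch)."""
    return n_outcomes ** k


def mean_observations_per_history(n_trials: int, k: int,
                                  n_outcomes: int = 2) -> float:
    """Average number of trials available per history.

    Only trials preceded by at least k outcomes have a complete history,
    so n_trials - k observations are shared among all histories.
    """
    return (n_trials - k) / n_histories(k, n_outcomes)


if __name__ == "__main__":
    n_trials = 300  # hypothetical session length, not taken from the study
    for k in (0, 1, 2, 4, 8):
        print(f"k={k}: {n_histories(k):4d} histories, "
              f"{mean_observations_per_history(n_trials, k):7.2f} "
              f"observations per history on average")
```

In this sketch the number of histories doubles with each increment of k, so the evidence available for any single history shrinks quickly, which is consistent with the remark that large k values generate extremely hard learning problems.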
