Stimulus discriminability may bias value-based probabilistic learning

Iris Schutte, Anne G E Collins, Heleen A Slagter, Michael J Frank, J Leon Kenemans, Tom Verguts

doi:10.1371/journal.pone.0176205

Abstract

Reinforcement learning tasks are often used to assess participants’ tendency to learn more from the positive or more from the negative consequences of their actions. However, this assessment often requires comparison in learning performance across different task conditions, which may differ in the relative salience or discriminability of the stimuli associated with more and less rewarding outcomes, respectively. To address this issue, in a first set of studies, participants were subjected to two versions of a common probabilistic learning task. The two versions differed with respect to the stimulus (Hiragana) characters associated with reward probability. The assignment of character to reward probability was fixed within version but reversed between versions. We found that performance was highly influenced by task version, which could be explained by the relative perceptual discriminability of characters assigned to high or low reward probabilities, as assessed by a separate discrimination experiment. Participants were more reliable in selecting rewarding characters that were more discriminable, leading to differences in learning curves and their sensitivity to reward probability. This difference in experienced reinforcement history was accompanied by performance biases in a test phase assessing the ability to learn from positive vs. negative outcomes. In a subsequent large-scale web-based experiment, this impact of task version on learning and test measures was replicated and extended. Collectively, these findings imply a key role for perceptual factors in guiding reward learning and underscore the need to control stimulus discriminability when making inferences about individual differences in reinforcement learning.

Highlights

Reinforcement learning refers to the ability of humans and other animals to learn from the outcome of their actions
We found that task version had strong effects on learning curves, such that subjects were more reliable in selecting rewarding characters that were more discriminable, leading to differences in experienced reward probability for the critical items across versions
Replicating the results of experiment 1a, performance in the AB pair was significantly better compared to both of the other pairs in task version 1 (p’s .003), whereas in version 2 AB performance did not differ from CD (p = .71) while performance for both the AB and CD pairs was enhanced relative to the EF pair (p’s .004)

Summary

Introduction

Reinforcement learning refers to the ability of humans and other animals to learn from the outcome of their actions. All ten participants subjected to task version 2 were classified as negative learners, because of their greater accuracy in avoiding stimulus B in novel test pairs, compared to choosing A.
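
The classification rule described above, comparing accuracy in avoiding B with accuracy in choosing A on novel test pairs, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the per-trial boolean input format, and the "positive" and "neutral" labels are assumptions made for illustration. The source states only that greater avoid-B accuracy led to a "negative learner" classification.

```python
from typing import Sequence


def accuracy(correct: Sequence[bool]) -> float:
    """Fraction of correct trials; correct[i] is True when trial i was correct."""
    if not correct:
        raise ValueError("need at least one trial")
    return sum(correct) / len(correct)


def classify_learner(choose_a_correct: Sequence[bool],
                     avoid_b_correct: Sequence[bool]) -> str:
    """Label a participant from test-phase performance on novel pairs.

    choose_a_correct: trials containing A, True when A was chosen.
    avoid_b_correct:  trials containing B, True when B was avoided.
    """
    choose_a = accuracy(choose_a_correct)
    avoid_b = accuracy(avoid_b_correct)
    if avoid_b > choose_a:
        return "negative"  # better at avoiding B than at choosing A
    if choose_a > avoid_b:
        return "positive"  # assumed mirror-image label, not stated in the source
    return "neutral"       # assumed label for ties, not stated in the source


# Hypothetical participant: 50% choose-A accuracy, 75% avoid-B accuracy.
label = classify_learner([True, False, True, False],
                         [True, True, True, False])
```

Because the label depends only on which of the two accuracies is larger, anything that shifts one of them, such as stimulus discriminability, can change the classification without any difference in how a participant learns from positive versus negative outcomes.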
