Abstract

Value-based decisions are often guided by past experience. If a choice led to a good outcome, we are more likely to repeat it. This basic idea is well captured by reinforcement-learning models. However, open questions remain about how we assign value to options we did not choose and therefore never had the chance to learn about directly. One solution to this problem is proposed by policy gradient reinforcement-learning models, which do not require direct learning of value and instead optimize choices according to a behavioral policy. For example, a logistic policy predicts that if a chosen option was rewarded, the unchosen option would be deemed less desirable. Here, we test the relevance of these models to human behavior and explore the role of memory in this phenomenon. We hypothesize that a policy may emerge from an associative memory trace formed during deliberation between choice options. In a preregistered study (n = 315), we show that people tend to invert the value of unchosen options relative to the outcome of chosen options, a phenomenon we term the inverse decision bias. The inverse decision bias is correlated with memory for the association between choice options; moreover, it is reduced when memory formation is experimentally interfered with. Finally, we present a new memory-based policy gradient model that predicts both the inverse decision bias and its dependence on memory. Our findings point to a significant role of associative memory in the valuation of unchosen options and introduce a new perspective on the interaction between decision-making, memory, and counterfactual reasoning.
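To make the logistic-policy prediction concrete, here is a minimal sketch, not the authors' preregistered model: a two-option softmax (logistic) policy with preference weights updated by a REINFORCE-style policy gradient. The names h, alpha, and the reward scheme are illustrative assumptions. Because the two choice probabilities sum to 1, the gradient of the log-policy for the chosen option is negative for the unchosen one, so rewarding the chosen option mechanically lowers the unchosen option's preference, mirroring the inverse decision bias described above.

```python
import numpy as np

def softmax(h):
    """Logistic (softmax) choice policy over preference weights h."""
    e = np.exp(h - h.max())
    return e / e.sum()

rng = np.random.default_rng(0)
alpha = 0.1            # learning rate (illustrative value)
h = np.zeros(2)        # preferences for options A (index 0) and B (index 1)

for _ in range(100):
    p = softmax(h)
    choice = rng.choice(2, p=p)
    reward = 1.0 if choice == 0 else 0.0   # only option A pays off here

    # REINFORCE update: grad of log pi(choice) w.r.t. h is
    # one_hot(choice) - p, so the unchosen weight moves opposite
    # to the reward received for the chosen option.
    grad_log_pi = -p
    grad_log_pi[choice] += 1.0
    h += alpha * reward * grad_log_pi

print("preferences:", h)            # h[1] < 0: unchosen option devalued
print("choice probs:", softmax(h))
```

Note that h[1] only ever decreases in this sketch, even though option B's payoff is never observed; the devaluation of the unchosen option falls directly out of the policy gradient rather than out of any learned value estimate.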
