Abstract

Most accounts of behavior in nonhuman animals assume that they make choices to maximize expected reward value. However, model-free reinforcement learning based on reward associations cannot account for choice behavior in transitive inference paradigms. We manipulated the amount of reward associated with each item of an ordered list, so that maximizing expected reward value was always in conflict with decision rules based on the implicit list order. Under such a schedule, model-free reinforcement algorithms cannot achieve high levels of accuracy, even after extensive training. Monkeys nevertheless learned to make correct rule-based choices. These results show that monkeys' performance in transitive inference paradigms is not driven solely by expected reward and that appropriate inferences are made despite discordant reward incentives. We show that their choices can be explained by an abstract, model-based representation of list order, and we provide a method for inferring the contents of such representations from observed data.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.