Abstract

Popular computational models of decision-making make specific assumptions about learning processes that may cause them to underfit observed behaviours. Here we suggest an alternative method using recurrent neural networks (RNNs) to generate a flexible family of models that have sufficient capacity to represent the complex learning and decision-making strategies used by humans. In this approach, an RNN is trained to predict the next action that a subject will take in a decision-making task and, in this way, learns to imitate the processes underlying subjects’ choices and their learning abilities. We demonstrate the benefits of this approach using a new dataset drawn from patients with either unipolar (n = 34) or bipolar (n = 33) depression and matched healthy controls (n = 34) making decisions on a two-armed bandit task. The results indicate that this new approach is better than baseline reinforcement-learning methods in terms of overall performance and its capacity to predict subjects’ choices. We show that the model can be interpreted using off-policy simulations and thereby provides a novel clustering of subjects’ learning processes—something that often eludes traditional approaches to modelling and behavioural analysis.

Highlights

  • A computational model of decision-making is a mathematical function that inputs past experiences—such as chosen actions and the value of rewards—and outputs predictions about future actions [e.g. 1, 2, 3]

  • Designing a computational model is often based on manual engineering with an iterative process to examine the consistency between different aspects of the model and the empirical data

  • We developed a recurrent neural network (RNN) as a flexible type of model that can automatically characterize human decision-making processes without requiring manual tweaking and engineering
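To make the highlighted approach concrete, the sketch below shows one-step-ahead action prediction with a minimal vanilla RNN in numpy. This is an illustrative toy, not the authors' architecture: the weights here are random placeholders (in the actual approach they would be fitted to subjects' choice sequences, e.g. by maximum likelihood), and the hidden size and input encoding are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 8  # hidden units (arbitrary for this sketch)

# Random parameters stand in for weights that would be learned
# by fitting the network to observed choice sequences.
Wx = rng.normal(0, 0.1, (H, 4))   # input: one-hot action (2) + one-hot reward (2)
Wh = rng.normal(0, 0.1, (H, H))   # recurrent weights
Wo = rng.normal(0, 0.1, (2, H))   # readout to action logits

def predict_next_actions(actions, rewards):
    """One-step-ahead action prediction with a vanilla RNN.

    On each trial the network receives the previous action and its
    reward (0 or 1), updates its hidden state, and outputs a softmax
    distribution over the two arms for the next choice.
    """
    h = np.zeros(H)
    preds = []
    for a, r in zip(actions, rewards):
        x = np.zeros(4)
        x[a] = 1.0            # which arm was chosen
        x[2 + int(r)] = 1.0   # whether it paid off
        h = np.tanh(Wx @ x + Wh @ h)
        logits = Wo @ h
        p = np.exp(logits - logits.max())
        preds.append(p / p.sum())
    return np.array(preds)

# toy history: three trials on a two-armed bandit
probs = predict_next_actions(actions=[0, 1, 0], rewards=[1, 0, 1])
print(probs.shape)  # one probability pair per trial
```

Because the hidden state accumulates the full history of actions and rewards, the network is not committed to any particular update rule (such as a delta rule with a fixed learning rate), which is what gives this family of models its flexibility.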

Introduction

A computational model of decision-making is a mathematical function that inputs past experiences—such as chosen actions and the value of rewards—and outputs predictions about future actions [e.g. 1, 2, 3]. Such models embody specific assumptions about the underlying learning process. If the actual learning and choice processes used by real human subjects differ from those assumptions (e.g., if a single learning-rate parameter is assumed to update action values when the effects of reward and punishment are in fact modulated by different learning rates), the model will misfit the data [e.g., 4]. To overcome this problem, computational modelling often involves an iterative process: additional analyses to assess assumptions about model behaviour, emendation of the structural features of the model to reduce residual fitting error, new analyses, and so forth. This process is difficult for two reasons. (i) In each iteration, the unexplained variance in the data can be attributed either to the natural randomness of human actions, which implies that no further model improvement is required, or to the lack of a mechanism in the model to absorb the remaining variance, which implies that further iterations are required. (ii) Even if it is believed that further iterations are required, improving the model is mostly based on manual engineering, in the hope of finding a new mechanism that, when added to the model, provides a better explanation for the data.
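As a concrete instance of the kind of model discussed above, the sketch below implements a standard delta-rule Q-learning model of a two-armed bandit with a softmax choice rule. The parameter values and the single shared learning rate are illustrative assumptions, not values from the paper; the single `alpha` applied to both rewards and punishments is exactly the sort of structural assumption that can misfit behaviour when gains and losses are learned at different rates.

```python
import numpy as np

def q_learning_choice_probs(actions, rewards, alpha=0.3, beta=3.0):
    """Delta-rule Q-learning for a two-armed bandit.

    For each trial, returns the softmax probability of each action
    given the history so far. `alpha` is a single learning rate
    applied to both rewards and punishments; `beta` is the inverse
    temperature of the softmax choice rule.
    """
    q = np.zeros(2)                  # current action values
    probs = []
    for a, r in zip(actions, rewards):
        p = np.exp(beta * q) / np.exp(beta * q).sum()  # softmax policy
        probs.append(p)
        q[a] += alpha * (r - q[a])   # prediction-error (delta-rule) update
    return np.array(probs)

# toy history: arm 0 mostly rewarded, so its choice probability rises
probs = q_learning_choice_probs(actions=[0, 0, 1, 0], rewards=[1, 1, 0, 1])
print(probs)
```

Extending such a model (for example, splitting `alpha` into separate rates for positive and negative prediction errors) is the kind of hand-engineered iteration the text describes, and is what the RNN approach aims to replace with a model family flexible enough to absorb such structure automatically.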

