Human complex exploration strategies are enriched by noradrenaline-modulated heuristics.

Magda Dubois, Tobias U Hauser, Jochen Michely, Ray J Dolan, Rani Moran, Johanna Habicht

doi:10.7554/elife.59907

Open Access

https://doi.org/10.7554/elife.59907

Abstract

An exploration-exploitation trade-off, the arbitration between sampling a lesser-known option and a known rich option, is thought to be solved using computationally demanding exploration algorithms. Given known limitations in human cognitive resources, we hypothesised the presence of additional, cheaper strategies. We tested for such heuristics in choice behaviour and show that they include a value-free random exploration, which ignores all prior knowledge, and a novelty exploration, which targets novel options alone. In a double-blind, placebo-controlled drug study assessing contributions of dopamine (400 mg amisulpride) and noradrenaline (40 mg propranolol), we show that value-free random exploration is attenuated under the influence of propranolol, but not under amisulpride. Our findings demonstrate that humans deploy distinct computationally cheap exploration strategies and that value-free random exploration is under noradrenergic control.

Highlights

Chocolate, Toblerone, spinach, or hibiscus ice-cream? Do you go for the flavour you like the most, or another one? In such an exploration-exploitation dilemma, you need to decide whether to go for the option with the highest known subjective value or opt instead for less known or less valued options so as not to miss out on possibly even higher rewards.
We developed a novel multi-round three-armed bandit task (Figure 1; bandits depicted as trees), enabling us to assess the contributions of value-free random exploration and novelty exploration in addition to Thompson sampling and Upper Confidence Bound (UCB).
Novelty exploration assigns a ‘novelty bonus’ only to bandits for which subjects have no prior information, and not to other bandits. This can be seen as a low-resolution version of UCB, which assigns a bonus to all choice options in proportion to how informative they are: in effect, a graded bonus that scales with each bandit’s uncertainty.
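
The contrast between these strategies can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' model: the function names, the Gaussian belief about each bandit (a mean and a standard deviation), the bonus weights, and the epsilon-greedy form used here for value-free random exploration are all hypothetical choices made for this example.

```python
import random


def ucb_values(means, stds, bonus_weight=1.0):
    """Graded bonus: every option gets a bonus that scales with its uncertainty."""
    return [m + bonus_weight * s for m, s in zip(means, stds)]


def novelty_values(means, n_samples, novelty_bonus=1.0):
    """All-or-nothing bonus: only options with no prior samples receive it."""
    return [m + (novelty_bonus if n == 0 else 0.0)
            for m, n in zip(means, n_samples)]


def thompson_choice(means, stds, rng):
    """Draw one value per option from its belief and pick the largest draw."""
    draws = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return max(range(len(draws)), key=draws.__getitem__)


def value_free_random_choice(values, epsilon, rng):
    """With probability epsilon, pick uniformly at random, ignoring all values
    (assumed epsilon-greedy form); otherwise pick the highest-valued option."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=values.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical beliefs about three bandits; the third has never been sampled.
    means = [6.0, 4.0, 5.0]
    stds = [0.5, 1.0, 3.0]
    n_samples = [3, 1, 0]

    print("UCB values:", ucb_values(means, stds))
    print("Novelty values:", novelty_values(means, n_samples, novelty_bonus=2.0))
    print("Thompson choice:", thompson_choice(means, stds, rng))
    print("Value-free random choice:",
          value_free_random_choice(means, epsilon=0.1, rng=rng))
```

In this sketch, UCB raises the value of every bandit by an amount tied to its uncertainty, whereas the novelty rule raises only the never-sampled bandit, which is why it can be read as a low-resolution version of UCB. The value-free rule is the cheapest of all: on an exploratory trial it needs no value or uncertainty estimates.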

Summary

Introduction

In such an exploration-exploitation dilemma, you need to decide whether to go for the option with the highest known subjective value (exploitation) or opt instead for less known or less valued options (exploration) so as not to miss out on possibly even higher rewards. In the latter case, you can opt to either choose an option that you have previously enjoyed (Toblerone), an option you are curious about because you do not know what to expect (hibiscus), or even an option that you have disliked in the past (spinach). A common approach to the study of complex decision making, for example an exploration-exploitation trade-off, is to take computational algorithms developed in the field of artificial intelligence and test whether key signatures of these are evident in human behaviour. This approach has revealed that humans use strategies that reflect an implementation of computationally demanding exploration algorithms (Gershman, 2018; Schulz and Gershman, 2019).

Journal: eLife	Publication Date: Jan 4, 2021
Citations: 39	License type: CC BY 4.0
