Abstract
The explore-exploit dilemma occurs anytime we must choose between exploring unknown options for information and exploiting known resources for reward. Previous work suggests that people use two different strategies to solve the explore-exploit dilemma: directed exploration, driven by information seeking, and random exploration, driven by decision noise. Here, we show that these two strategies rely on different neural systems. Using transcranial magnetic stimulation to inhibit the right frontopolar cortex, we were able to selectively inhibit directed exploration while leaving random exploration intact. This suggests a causal role for right frontopolar cortex in directed, but not random, exploration and that directed and random exploration rely on (at least partially) dissociable neural systems.
Highlights
In an uncertain world adaptive behavior requires us to carefully balance the exploration of new opportunities with the exploitation of known resources
Directed exploration, which involves information seeking, can be quantified as the probability of choosing the high information option, pðhigh infoÞ in the [1 3] condition, while random exploration, which involves decision noise, can be quantified as the probability of making a mistake, or choosing the low mean reward option, pðlow meanÞ, in the [2 2] condition. Using these measures of exploration, we found that inhibiting the right frontopolar cortex (RFPC) had a significant effect on directed exploration but not random exploration (Figure 3A,B)
Using a task that is able to behaviorally dissociate these two types of exploration, we found that inhibition of RFPC caused a selective reduction in directed, but not random exploration
Summary
In an uncertain world adaptive behavior requires us to carefully balance the exploration of new opportunities with the exploitation of known resources. Recent work has suggested that humans use two distinct strategies to solve the explore-exploit dilemma: directed exploration, based on information seeking, and random exploration, based on decision noise (Wilson et al, 2014). Even though both of these strategies serve the same purpose, that is, balancing exploration and exploitation, it is likely they rely on different cognitive mechanisms. Random exploration can be implemented in a simpler fashion by using neural or environmental noise to randomize choice (Thompson, 1933)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.