Compositional clustering in task structure learning.

Nicholas T Franklin, Michael J Frank

doi:10.1371/journal.pcbi.1006116

Abstract

Humans are remarkably adept at generalizing knowledge between experiences in a way that can be difficult for computers. Often, this entails generalizing constituent pieces of experiences that do not fully overlap with, but nonetheless share useful similarities with, previously acquired knowledge. However, it is often unclear how knowledge gained in one context should generalize to another. Previous computational models and data suggest that rather than learning about each individual context, humans build latent abstract structures and learn to link these structures to arbitrary contexts, facilitating generalization. In these models, task structures that are more popular across contexts are more likely to be revisited in new contexts. However, these models can only re-use policies as a whole and are unable to transfer knowledge about the transition structure of the environment even if only the goal has changed (or vice-versa). This contrasts with ecological settings, where some aspects of task structure, such as the transition function, will be shared between contexts separately from other aspects, such as the reward function. Here, we develop a novel non-parametric Bayesian agent that forms independent latent clusters for transition and reward functions, affording separable transfer of their constituent parts across contexts. We show that the relative performance of this agent compared to an agent that jointly clusters reward and transition functions depends on environmental task statistics: the mutual information between transition and reward functions and the stochasticity of the observations. We formalize our analysis through an information-theoretic account of the priors, and propose a meta-learning agent that dynamically arbitrates between strategies across task domains to optimize a statistical tradeoff.
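To make the "mutual information between transition and reward functions" concrete, the following is a minimal sketch, not the authors' analysis code. It assumes each context has already been labeled with one transition function and one reward function; the labels (`"T1"`, `"R1"`, ...) and the function name `mutual_information` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information (in bits) between the two labels in `pairs`.

    `pairs` is a list of (transition_label, reward_label) tuples,
    one tuple per context. Labels are hypothetical identifiers.
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (T, R) combination
    p_t = Counter(t for t, _ in pairs)     # marginal counts of transitions
    p_r = Counter(r for _, r in pairs)     # marginal counts of rewards
    mi = 0.0
    for (t, r), count in joint.items():
        p_joint = count / n
        p_indep = (p_t[t] / n) * (p_r[r] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi


# Each transition function always appears with the same reward function.
coupled = [("T1", "R1"), ("T1", "R1"), ("T2", "R2"), ("T2", "R2")]

# Every transition function appears equally often with every reward function.
uncoupled = [("T1", "R1"), ("T1", "R2"), ("T2", "R1"), ("T2", "R2")]

print(mutual_information(coupled))
print(mutual_information(uncoupled))
```

In the coupled domain, knowing the transition function identifies the reward function, so the mutual information is high; in the uncoupled domain it is zero. The abstract states that the relative performance of joint and independent clustering depends on this quantity, together with the stochasticity of the observations, which this sketch does not model.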

Highlights

Compared to artificial agents, humans exhibit remarkable flexibility in our ability to rapidly, spontaneously and appropriately learn to behave in unfamiliar situations, by generalizing past experience and performing symbolic-like operations on constituent components of knowledge [1]
Based on statistics and the opportunity for generalization, the learner has to infer which environmental features should constitute the context that signals the overall task structure, and, simultaneously, which features are indicative of the specific appropriate behaviors for the inferred task structure
We first consider two minimal sets of simulations to illustrate the complementary advantages afforded by the two sorts of clustering agents depending on the statistics of the task domain, using a common set of parameters

Summary

Introduction

Humans exhibit remarkable flexibility in our ability to rapidly, spontaneously and appropriately learn to behave in unfamiliar situations, by generalizing past experience and performing symbolic-like operations on constituent components of knowledge [1]. Based on statistics and the opportunity for generalization, the learner has to infer which environmental features (stimulus dimensions, episodes, etc.) should constitute the context that signals the overall task structure, and, simultaneously, which features are indicative of the specific appropriate behaviors for the inferred task structure. This learning strategy is well captured by Bayesian nonparametric models, and neural network approximations thereof, that impose a hierarchical clustering process onto learning task structures [3, 4]. Empirical studies have provided evidence that humans spontaneously impute such hierarchical structure, which facilitates future transfer, whether or not it is immediately beneficial to initial learning, and even if it is costly [3, 4, 5].
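The clustering idea can be sketched with a Chinese restaurant process (CRP) prior, a common choice in Bayesian nonparametric models under which popular clusters are more likely to be reused in a new context. The text above does not name the prior, so treat the CRP, the concentration parameter `alpha`, and all function and label names below as assumptions made for illustration rather than the authors' implementation.

```python
from collections import Counter


def crp_prior(assignments, alpha=1.0):
    """CRP prior over cluster labels for a new context.

    `assignments` lists the cluster label of each previously seen context.
    Returns a dict mapping each existing label, plus the key "new",
    to its prior probability.
    """
    counts = Counter(assignments)
    total = len(assignments) + alpha
    prior = {label: n / total for label, n in counts.items()}
    prior["new"] = alpha / total
    return prior


def joint_prior(pairs, alpha=1.0):
    """Joint clustering: each (transition, reward) pair is one cluster."""
    return crp_prior(pairs, alpha)


def independent_prior(pairs, alpha=1.0):
    """Independent clustering: separate CRPs for transitions and rewards."""
    transitions = crp_prior([t for t, _ in pairs], alpha)
    rewards = crp_prior([r for _, r in pairs], alpha)
    return transitions, rewards


# Hypothetical history: four contexts with their inferred task structure.
history = [("T1", "R1"), ("T1", "R2"), ("T1", "R1"), ("T2", "R2")]

joint = joint_prior(history)
transitions, rewards = independent_prior(history)

# Prior that a new context reuses T2 together with R1, a combination
# that never occurred together in the history.
novel_combo = transitions["T2"] * rewards["R1"]

print(joint)
print(transitions, rewards)
print(novel_combo)
```

Under the joint prior, the combination of `T2` with `R1` has no existing cluster and can only be reached through the "new" cluster, which means relearning both functions. Under the independent prior the same combination receives prior mass directly, because each part is reused separately. This is the sense in which independent clustering affords separable transfer.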

Journal: PLOS Computational Biology
Publication Date: Apr 19, 2018
Citations: 48
License type: CC BY 4.0

Similar Papers

Reward-predictive representations generalize across tasks in reinforcement learning
Lucas Lehnert ... Michael L Littman
PLOS Computational Biology | VOL. 16
15 Oct 2020

Definable Zero-Sum Stochastic Games
Jérôme Bolte ... Guillaume Vigeral
Mathematics of Operations Research | VOL. 40
14 Nov 2013

Decentralized Stochastic Planning with Anonymity in Interactions
Pradeep Varakantham ... Yossiri Adulyasak
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 28
21 Jun 2014
