Abstract

Human learning and intelligence work differently from the supervised pattern recognition approach adopted in most deep learning architectures. Humans seem to learn rich representations by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks. We suggest a simple but effective unsupervised model that develops these characteristics. The agent learns to represent the dynamical physical properties of its environment through intrinsically motivated exploration and performs inference on this representation to reach goals. To this end, a set of self-organizing maps representing state-action pairs is combined with a causal model for sequence prediction. The proposed system is evaluated in the cartpole environment. After an initial phase of playful exploration, the agent can run kinematic simulations of the environment's future and use them for action planning. We demonstrate its performance on a set of related but distinct one-shot imitation tasks, which the agent solves flexibly in an active inference style.
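
To make the architecture concrete, here is a minimal, hypothetical sketch of the core idea: a self-organizing map (SOM) that quantizes joint state-action vectors, combined with a transition-count table serving as the causal model for sequence prediction. All class and parameter names, map sizes, and learning rates below are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: a SOM over (state, action) vectors plus a
# transition-count table as the causal sequence model.
import numpy as np

class StateActionSOM:
    def __init__(self, n_units=100, dim=5, lr=0.1, sigma=2.0, seed=0):
        rng = np.random.default_rng(seed)
        grid = int(np.sqrt(n_units))
        # prototype vectors for joint (state, action) inputs
        self.weights = rng.normal(size=(n_units, dim))
        # 2D lattice coordinates of the map units
        self.coords = np.array([(i // grid, i % grid)
                                for i in range(n_units)], dtype=float)
        self.lr, self.sigma = lr, sigma
        # transition counts: how often unit i is followed by unit j
        # (Laplace-smoothed so every transition has nonzero probability)
        self.transitions = np.ones((n_units, n_units))

    def winner(self, x):
        # best-matching unit for input vector x
        return int(np.argmin(np.linalg.norm(self.weights - x, axis=1)))

    def update(self, x):
        # standard SOM update with a Gaussian neighborhood on the lattice
        w = self.winner(x)
        d = np.linalg.norm(self.coords - self.coords[w], axis=1)
        h = np.exp(-d**2 / (2 * self.sigma**2))
        self.weights += self.lr * h[:, None] * (x - self.weights)
        return w

    def observe_transition(self, prev_unit, next_unit):
        self.transitions[prev_unit, next_unit] += 1

    def predict_next(self, unit):
        # most likely successor unit under the learned causal model
        return int(np.argmax(self.transitions[unit]))
```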

Highlights

  • During the last decade, rapid progress in the field of deep learning has led to a number of remarkable achievements in many fields of artificial intelligence (AI) [1]

  • We suggest a very simple neural architecture that learns in a completely unsupervised fashion and incorporates several of the mentioned principles: it learns a model of the dynamics of its environment by playful exploration (“intuitive physics”), can play virtual, predicted episodes (“what could be true and is not”), and can plan action sequences to bring the environment closer to a target state given extrinsically by a one-shot demonstration (“planning actions to make it so”)

  • In the Method and Results section thereafter, we provide a proof of concept using OpenAI’s Gym cartpole environment. The section is subdivided into three parts: First, the “state-action-prediction self-organizing maps” (SapSom) model is trained by playfully exploring the cartpole environment using random action sequences; a minimal sketch of this exploration phase follows the list
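
As a rough, hypothetical illustration of the exploration phase named in the last highlight, the sketch below drives the cartpole with random actions while the SOM from the earlier sketch quantizes each (state, action) pair and the transition table records which unit follows which. It assumes the Gymnasium API for "CartPole-v1"; the episode budget and all other settings are our own illustrative choices, not the paper's.

```python
# Playful-exploration sketch: random actions train the SOM and its
# transition model. Assumes the StateActionSOM sketch defined above.
import gymnasium as gym
import numpy as np

env = gym.make("CartPole-v1")
som = StateActionSOM(n_units=100, dim=5)  # 4 state dims + 1 action dim

prev_unit = None
obs, _ = env.reset(seed=0)
for _ in range(10_000):                    # exploration budget is illustrative
    action = env.action_space.sample()     # "playful": purely random actions
    x = np.append(obs, action)             # joint state-action vector
    unit = som.update(x)
    if prev_unit is not None:
        som.observe_transition(prev_unit, unit)
    prev_unit = unit
    obs, _, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
        prev_unit = None                   # do not link across episode boundaries
env.close()
```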

Summary

Introduction

Rapid progress in the field of deep learning has led to a number of remarkable achievements in many fields of artificial intelligence (AI) [1]. The model addresses the second and third issues by being able to actively explore the causal structure of its environment, to learn intuitive physics, and to plan actions in order to achieve goals.
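
As an illustration of how such planning could work, the hypothetical sketch below performs random-shooting search: it rolls candidate action sequences forward purely inside the learned SOM model (the "kinematic simulation" of the abstract) and picks the sequence whose predicted end state is closest to a demonstrated goal. The function and its parameters are our own assumptions; the paper's actual inference scheme is cast as active inference and may differ.

```python
# Illustrative random-shooting planner over the learned SOM model,
# not the paper's exact algorithm. Assumes the StateActionSOM sketch above.
import numpy as np

def plan(som, obs, goal_state, horizon=10, n_candidates=200, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    best_seq, best_cost = None, np.inf
    for _ in range(n_candidates):
        seq = rng.integers(0, 2, size=horizon)      # candidate cartpole actions
        state = obs.copy()
        for a in seq:
            unit = som.winner(np.append(state, a))  # current (state, action) unit
            nxt = som.predict_next(unit)            # predicted successor unit
            state = som.weights[nxt, :4]            # state part of that prototype
        cost = np.linalg.norm(state - goal_state)   # distance to demonstrated goal
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq
```

In a receding-horizon loop, one would execute the first action of the returned sequence, observe the new state, and replan, which keeps the open-loop rollouts short and the plan responsive to prediction errors.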

