Abstract

Programming agent behaviors is a tedious task. Typically, behaviors are developed through repeated code-test-debug cycles. The difficulty increases in a multiagent setting due to the larger design space: the density of interactions, the number of agents, and the agents’ heterogeneity (in both capabilities and behaviors) all enlarge it. This makes training the agents, rather than programming them, highly attractive. One training approach is Learning from Demonstration (LfD), in which agents learn behaviors in real time from examples provided by a human demonstrator. The learned behavior maps environmental features to agent action(s) and is constructed from a database of state/action examples supplied by the demonstrator. The database is built interactively: initially, the agent is in “training mode,” where the demonstrator controls the agent. Every time the demonstrator changes the agent’s behavior, the agent saves an example to the database. When the demonstrator finishes collecting examples, the agent learns the behavior and then enters “testing mode.” The demonstrator may offer corrections to the agent based on observation; these corrections add examples to the database, and the behavior is relearned. LfD is a natural way to train agents since it closely mimics how humans teach each other, for example in sports, music, and physical therapy. In my LfD implementation, called Hierarchical Training of Agent Behavior (HITAB), the agents learn behaviors represented as automata. HITAB is a supervised machine learning approach that uses a classification algorithm to learn the transitions inside the behavior automaton. Typically, supervised machine learning requires significant data to learn robust behaviors; this is especially true in complex, high-dimensional design spaces. However, gathering data is potentially expensive, since each data point requires an experiment (physical or simulated) conducted in real time.
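The interactive database construction described above can be sketched as follows. This is an illustrative toy, not the HITAB implementation: a 1-nearest-neighbor classifier stands in for the learner (HITAB itself learns the transitions of a behavior automaton), and the states, actions, and feature values are invented for the example.

```python
# Sketch of interactive LfD: a growing database of state/action examples,
# with the behavior relearned after each demonstration or correction.

def nearest_neighbor_policy(database):
    """Learn a behavior from the database: map a state (feature vector)
    to the action of the closest stored example."""
    def policy(state):
        def sq_dist(example):
            features, _ = example
            return sum((a - b) ** 2 for a, b in zip(features, state))
        _, action = min(database, key=sq_dist)
        return action
    return policy

database = []

# Training mode: the demonstrator controls the agent; each change of
# behavior saves a state/action example to the database.
database.append(((0.9, 0.1), "turn-left"))
database.append(((0.1, 0.9), "turn-right"))
database.append(((0.5, 0.5), "go-forward"))

behavior = nearest_neighbor_policy(database)  # learn, enter testing mode

# Testing mode: the demonstrator observes and offers a correction, which
# adds an example to the database; the behavior is then relearned.
database.append(((0.8, 0.8), "stop"))
behavior = nearest_neighbor_policy(database)

print(behavior((0.85, 0.15)))  # → turn-left (closest example is (0.9, 0.1))
```

The key point the sketch preserves is that learning is cheap to rerun, so every correction simply grows the database and triggers relearning.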
HITAB’s behavior representation helps reduce the number of required samples by decomposing the task into smaller, less complex subtasks. In addition, these smaller subtasks may require a reduced set of behaviors and/or features, further shrinking the design space. Hence, HITAB rapidly learns complex behaviors that are simple from a machine learning perspective. While this places HITAB at the simpler edge of machine learning, it allows novices to train an agent to perform complex behaviors without detailed programming knowledge.

1 Single Agent Model
