Abstract

Programming agent behaviors is a tedious task. Typically, behaviors are developed through repeated code-test-debug cycles. The difficulty increases in a multiagent setting due to the larger design space: the density of interactions, the number of agents, and the agents’ heterogeneity (in both capabilities and behaviors) all enlarge it. This makes training the agents, rather than programming them, highly attractive. One training approach is Learning from Demonstration (LfD), in which agents learn behaviors in real time from examples provided by a human demonstrator. The learned behavior maps environmental features to agent action(s) and is constructed from a database of state/action examples supplied by the demonstrator. The database is built interactively: initially, the agent is in “training mode,” where the demonstrator controls the agent. Every time the demonstrator changes the agent’s behavior, the agent saves an example to the database. When the demonstrator finishes collecting examples, the agent learns the behavior and then enters “testing mode.” The demonstrator may offer corrections to the agent based on observation; these corrections add examples to the database, and the behavior is relearned. LfD is a natural way to train agents since it closely mimics how humans teach each other, for example in sports, music, and physical therapy. In my LfD implementation, called Hierarchical Training of Agent Behavior (HITAB), the agents learn behaviors represented as automata. HITAB is a supervised machine learning approach that uses a classification algorithm to learn the transitions inside the behavior automaton. Typically, supervised machine learning requires significant data to learn robust behaviors; this is especially true in complex, high-dimensional design spaces. However, gathering data is potentially expensive, since each data point requires an experiment (physical or simulated) conducted in real time.
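The interactive database construction described above can be sketched as follows. This is an illustrative toy, not the HITAB implementation: a 1-nearest-neighbor classifier stands in for the learner (HITAB itself learns the transitions of a behavior automaton), and the states, actions, and feature values are invented for the example.

```python
# Sketch of interactive LfD: a growing database of state/action examples,
# with the behavior relearned after each demonstration or correction.

def nearest_neighbor_policy(database):
    """Learn a behavior from the database: map a state (feature vector)
    to the action of the closest stored example."""
    def policy(state):
        def sq_dist(example):
            features, _ = example
            return sum((a - b) ** 2 for a, b in zip(features, state))
        _, action = min(database, key=sq_dist)
        return action
    return policy

database = []

# Training mode: the demonstrator controls the agent; each change of
# behavior saves a state/action example to the database.
database.append(((0.9, 0.1), "turn-left"))
database.append(((0.1, 0.9), "turn-right"))
database.append(((0.5, 0.5), "go-forward"))

behavior = nearest_neighbor_policy(database)  # learn, enter testing mode

# Testing mode: the demonstrator observes and offers a correction, which
# adds an example to the database; the behavior is then relearned.
database.append(((0.8, 0.8), "stop"))
behavior = nearest_neighbor_policy(database)

print(behavior((0.85, 0.15)))  # → turn-left (closest example is (0.9, 0.1))
```

The key point the sketch preserves is that learning is cheap to rerun, so every correction simply grows the database and triggers relearning.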
HITAB’s behavior representation helps reduce the number of required samples by decomposing the task into smaller, less complex subtasks. In addition, these smaller subtasks may require a reduced set of behaviors and/or features, further shrinking the design space. Hence, HITAB rapidly learns complex behaviors that are simple from a machine learning perspective. While this places HITAB at the simpler edge of machine learning, it allows novices to train an agent to perform complex behaviors without detailed programming knowledge.

1 Single Agent Model
