Evolutionary Policy Transfer and Search Methods for Boosting Behavior Quality: RoboCup Keep-Away Case Study

Geoff Nitschke, Sabre Didi

doi:10.3389/frobt.2017.00062

Abstract

This study evaluates various evolutionary search methods for directing neural controller evolution in combination with policy (behavior) transfer across increasingly complex collective robotic (RoboCup keep-away) tasks. That is, robot behaviors are first evolved in a source task and then transferred to more complex target tasks for further evolution. The evolutionary search methods tested include objective-based search (fitness function), behavioral and genotypic diversity maintenance, and hybrids of such diversity maintenance and objective-based search. Evolved behavior quality is evaluated according to effectiveness and efficiency. Effectiveness is the average task performance of transferred and evolved behaviors, where task performance is the average time the ball is controlled by a keeper team. Efficiency is the average number of generations taken for the fittest evolved behaviors to reach a minimum task performance threshold given policy transfer. Results indicate that policy transfer coupled with evolutionary search directed by hybridized behavioral diversity maintenance and objective-based search addresses the bootstrapping problem for increasingly complex keep-away tasks, in that this method evolves collective behaviors that could not be evolved by comparative evolutionary methods (with and without policy transfer).
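
The two quality measures defined in the abstract can be illustrated with a minimal Python sketch. The function names, the input layout (one list of best-of-generation performances per evolutionary run), and the choice to leave out runs that never reach the threshold are all assumptions made for illustration; the source does not specify these details.

```python
from statistics import mean
from typing import Optional, Sequence


def effectiveness(run_performances: Sequence[float]) -> float:
    """Average task performance over runs (e.g. mean ball-control time)."""
    return mean(run_performances)


def generations_to_threshold(
    best_per_generation: Sequence[float], threshold: float
) -> Optional[int]:
    """First generation (1-based) whose fittest behavior reaches the threshold."""
    for generation, performance in enumerate(best_per_generation, start=1):
        if performance >= threshold:
            return generation
    return None  # threshold never reached in this run


def efficiency(
    runs: Sequence[Sequence[float]], threshold: float
) -> Optional[float]:
    """Average generations-to-threshold over the runs that reach it.

    Assumption: runs that never reach the threshold are excluded.
    """
    reached = [
        g
        for g in (generations_to_threshold(run, threshold) for run in runs)
        if g is not None
    ]
    return mean(reached) if reached else None
```

Under these assumptions, a lower efficiency value means that fewer generations were needed, on average, to reach the minimum task performance.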

Highlights

To address this study’s research objective (Section 1) and investigate the impact of objective (Section 3.3) versus non-objective (Sections 3.4–3.5) based search on the evolution of collective behaviors transferred to increasingly complex keep-away tasks, we present results demonstrating comparative method effectiveness and efficiency
To elucidate that a hybrid evolutionary search approach combining objective and non-objective-based search is most suitable for efficiently evolving effective behavioral solutions to increasingly complex collective behavior tasks
To support a hypothesis that policy transfer coupled with evolutionary search is a consistently suitable method for boosting the effectiveness and efficiency of evolved solution quality across increasingly complex tasks

Summary

Introduction

Recent work in Evolutionary Robotics (ER) (Doncieux et al, 2015) has provided increasing empirical evidence that maintaining diversity in phenotypes (robot behaviors) improves the quality (task performance) of evolved behaviors (Mouret and Doncieux, 2012; Cully et al, 2015; Cully and Mouret, 2016; Gomes et al, 2016). In ER controller design, a growing body of research and empirical data indicates that non-objective evolutionary search, such as novelty search (Lehman and Stanley, 2011a) and other behavioral diversity maintenance approaches (Mouret and Doncieux, 2012), outperforms objective-based search in various evolutionary robotic control tasks defined by complex, high-dimensional, and deceptive fitness landscapes (Cully et al, 2015; Cully and Mouret, 2016; Gomes et al, 2016). Transferring knowledge learned on a source task accelerates learning and increases solution quality in target tasks by exploiting relevant prior knowledge.
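
As a rough illustration of the non-objective search mentioned above, novelty search commonly scores a behavior by its mean distance to its k nearest neighbors in a behavior space. The Python sketch below assumes a behavior is summarized as a fixed-length numeric vector and uses Euclidean distance; the `hybrid_score` function is one hypothetical way to blend fitness with novelty (a weighted sum) and is not necessarily the hybridization used in this study.

```python
import math
from typing import Sequence

# Assumption: a behavior is summarized as a fixed-length numeric vector.
Behavior = Sequence[float]


def novelty(candidate: Behavior, others: Sequence[Behavior], k: int = 3) -> float:
    """Mean Euclidean distance from the candidate to its k nearest neighbors."""
    distances = sorted(math.dist(candidate, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def hybrid_score(fitness: float, novelty_score: float, weight: float = 0.5) -> float:
    """Hypothetical weighted blend of objective fitness and novelty.

    Assumption: both inputs are already scaled to comparable ranges.
    """
    return weight * fitness + (1.0 - weight) * novelty_score
```

With `weight` set to 1.0 this reduces to purely objective-based search, and with 0.0 to purely novelty-driven search; intermediate values trade off the two pressures.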

Methods

Results

Discussion

Conclusion