This article considers how a cybernetic system can learn to control its actions in a hostile environment, focusing on environments with varying temperatures. Machines that operate outdoors have higher survivability if they act during cooler periods (e.g., at night or in the early morning rather than mid- to late afternoon during summer months). The assumption made here is that learning to choose actions that compensate for the influence of temperature benefits the functioning of individuals in robot societies (collections of cooperating robots called swarmbots or swarms). In keeping with this idea, a biologically-inspired form of adaptive learning is given. Conventional actor-critic learning provides a framework for the control strategy introduced here, and ethology (the study of the behavior of organisms) provides a basis for monitoring the behavior of a swarmbot. Individual behaviors, together with sensor measurements, are recorded in tables called ethograms. Swarm behavior tends to be episodic, and an ethogram is recorded during each episode in the lifespan of a swarm. Each ethogram is a source of measurements that can be used to influence learning during an episode. The contribution of this article is the introduction of a biologically-inspired approach to learning that adapts to changing temperatures.