Abstract
We investigate the performance of a learning classifier system in some simple multi-objective, multi-step maze problems, using both random and biased action-selection policies for exploration. Results show that the choice of action-selection policy can significantly affect the performance of the system in such environments. Further, this effect is directly related to population size, and we relate this finding to recent theoretical studies of learning classifier systems in single-step problems.
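To make the distinction between the two exploration policies concrete, the sketch below contrasts uniformly random action selection with roulette-wheel (payoff-proportionate) selection over a prediction array. This is only an illustration: the dictionary-based prediction array and the eight compass moves are assumptions for the example, not the article's implementation.

```python
import random

def random_action(prediction_array):
    """Unbiased exploration: every action is equally likely."""
    return random.choice(list(prediction_array))

def roulette_wheel_action(prediction_array):
    """Biased exploration: choose an action with probability proportional
    to its predicted payoff (roulette-wheel selection)."""
    actions = list(prediction_array)
    weights = [max(prediction_array[a], 0.0) for a in actions]
    total = sum(weights)
    if total == 0.0:                       # no payoff information yet: fall back to random
        return random.choice(actions)
    pick = random.uniform(0.0, total)
    running = 0.0
    for action, weight in zip(actions, weights):
        running += weight
        if pick <= running:
            return action
    return actions[-1]                     # guard against floating-point rounding

# Hypothetical prediction array for an agent that can move in eight compass directions
predictions = dict(zip(["N", "NE", "E", "SE", "S", "SW", "W", "NW"],
                       [12.0, 3.5, 0.0, 7.2, 1.1, 0.0, 4.8, 9.3]))
print(random_action(predictions), roulette_wheel_action(predictions))
```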
Highlights
This article is organized as follows: in Section 2 we briefly introduce related work, and in Section 3 we give an overview of the XCS classifier system.
In XCS, half of the learning trials are typically performed using a random action-selection policy, and half using a deterministic policy. Since our intention in these investigations is always to aim at the use of XCS for multi-objective problems performed by physical agents in the real world, we first investigate the performance of XCS under the traditional 50/50 random exploration scheme, and then investigate its performance using roulette-wheel action selection during the exploration trials.
We have found that roulette-wheel exploration can produce results comparable to those achieved using a random action-selection policy, at the expense of an increased population size N.
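The highlights above refer to the standard XCS scheme of alternating exploration and exploitation trials. A minimal sketch of that alternation, under assumed interfaces, is given below; the agent and environment objects are hypothetical placeholders rather than the XCS API, and the exploration policy is passed in so that random and roulette-wheel selection (as sketched after the Abstract) can be swapped directly.

```python
def run_trials(agent, env, n_trials, explore_policy):
    """Alternate exploration and exploitation trials (the usual 50/50 scheme).

    `agent` and `env` are hypothetical placeholders: `agent` is assumed to
    expose a prediction array for the current state and a learning update,
    `env` a reset/step interface. These names are not the XCS API itself.
    """
    for trial in range(n_trials):
        explore = (trial % 2 == 0)                  # half the trials explore, half exploit
        state = env.reset()
        done = False
        while not done:
            preds = agent.prediction_array(state)   # action -> predicted payoff
            if explore:
                action = explore_policy(preds)      # e.g. random_action or roulette_wheel_action
            else:
                action = max(preds, key=preds.get)  # deterministic: highest predicted payoff
            state, reward, done = env.step(action)
            if explore:
                agent.update(reward)                # learning is applied on exploration trials
```

In the article's comparison, the cost of switching the exploration policy from random to roulette-wheel selection shows up in the population-size parameter N, as noted in the highlight above.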
Summary
The needs of the moment may override longer-term objectives: the need for food is less than the need to avoid being eaten, the need for shelter may overcome the drive to mate, and so on. Balancing these conflicting drives is the difference between success and failure, between life and death, and the optimization of balancing behaviors is subject to great evolutionary pressure. In this article we investigate how the XCS classifier system performs on such multi-objective, multi-step maze problems, comparing the traditional 50/50 random exploration scheme with roulette-wheel action selection during the exploration trials.