Abstract
It is now 30 years since John Holland presented the first implementation of his learning classifier system (LCS) framework (Holland and Reitman 1978). This "Cognitive System Level 1" used a genetic algorithm (Holland 1975) to learn appropriate rules of behaviour in one-dimensional, dual-objective maze navigation tasks, with a form of reinforcement learning assigning utility to the rules. Holland later revised the algorithm to define what would become the standard system (Holland 1980, 1986). However, Holland's full system was somewhat complex, and in practice it proved difficult to realize the envisaged behaviour and performance (e.g., Wilson and Goldberg 1989), so interest waned. Some years later, Wilson presented the "zeroth-level" classifier system, ZCS (Wilson 1994), which "keeps much of Holland's original framework but simplifies it to increase understandability and performance" (ibid.). But ZCS did not reach optimality in the most common reinforcement learning sense. Accordingly, Wilson introduced XCS (Wilson 1995), a form of LCS that altered the way in which rule fitness is calculated. XCS also makes the connection between LCS and temporal difference learning (Watkins 1989) explicit: in its standard form, it represents the state-action value map as a set of rules, which enables compaction through generalization. In a separate development, shortly after Holland had formulated the general framework, Stephen Smith (1980) presented a modification wherein a traditional genetic algorithm was used to design a complete set of rules. That is, Smith's poker-playing "Learning System 1" avoided the need to assign utility to individual rules. The years since the introduction of XCS have seen a resurgence of interest in LCSs, as XCS in particular has been found able to reach optimality in a number of difficult benchmark problems.
Perhaps more importantly, XCS has also begun to be applied to a number of hard real-world problems, such as data mining, simulation modeling, robotics, and adaptive control (see Bull 2004 for an overview), where excellent performance has often been achieved. A theoretical basis