This paper describes and evaluates the performance of a learning classifier system (LCS) inspired algorithm called the Temporal Reinforcement And Classification Architecture (TRACA) on maze navigation tasks that contain hidden state. The evaluation of TRACA includes comparisons with other learning algorithms on selected difficult maze navigation tasks. Because not all LCSs are capable of learning every type of hidden-state maze, TRACA is compared specifically against the LCS-based approaches that are most capable on these tasks: XCSMH, AgentP (G), and AgentP (SA). Each algorithm is evaluated using a maze navigation task that has been identified as among the most difficult due to recurring aliased regions. The comparisons cover training time, test performance, and the size of the learned rule sets. The results indicate that each algorithm has its own advantages and drawbacks. For example, on the most difficult maze, TRACA's average number of steps to the goal is 10.1 versus 7.87 for AgentP (G); however, TRACA requires an average of only 354 training trials compared with 537 for AgentP (G). Following the maze tasks, TRACA is also tested on two variations of a truck driving task in which it must learn to navigate four lanes of slower vehicles while avoiding collisions. The results show that TRACA can achieve a low number of collisions with relatively few trials (as few as 24 collisions over 5000 time steps after 10,000 training time steps), but it may require multiple network construction attempts to achieve high performance.