XCS with eligibility traces

Authors :: Jan Drugowitsch
Alwyn M. Barry
Source :: GECCO
Publication Year :: 2005
Publisher :: ACM, 2005.
Abstract: The development of the XCS Learning Classifier System has produced a robust and stable implementation that performs competitively in direct-reward environments. Although investigations in delayed-reward (i.e. multi-step) environments have shown promise, XCS still struggles to efficiently find optimal solutions in environments with long action-chains. This paper highlights the strong relation of XCS to reinforcement learning and identifies some of the major differences. This makes it possible to add Eligibility Traces to XCS, a method taken from reinforcement learning that updates the prediction of the whole action-chain on each step, which should make prediction updates faster and more accurate. However, it is shown that the discrete nature of the condition representation of a classifier and the operation of the genetic algorithm cause traces to propagate back incorrect prediction values and in some cases result in a decrease of system performance. As a result, further investigation of the existing approach to generalisation is proposed.

Subjects :: Learning classifier system
Computer science
Genetic algorithm
Q-learning
Reinforcement learning
Artificial intelligence
Temporal difference learning
Machine learning
Classifier (UML)

Database :: OpenAIRE
Journal :: Proceedings of the 7th annual conference on Genetic and evolutionary computation
Accession number :: edsair.doi...........a55b588a4a626e155b5c4cf33481efea
Full Text :: https://doi.org/10.1145/1068009.1068322