1. Reinforcement learning and the effects of parameter settings in the game of Chung Toi
- Author
- Christopher J. Gatti, Mark J. Embrechts, and Jonathan D. Linton
- Subjects
Artificial neural network, Computer science, Reinforcement learning, Artificial intelligence, Temporal difference learning, Transfer function
- Abstract
This work applied reinforcement learning and the temporal difference TD(λ) algorithm to train a neural network to play the game of Chung Toi, a challenging variant of Tic-Tac-Toe. The effects of changing parameters and settings of the TD(λ) algorithm and of the neural network were evaluated by observing the ability of the network to learn the game of Chung Toi and play against a ‘smart’ random player. This work applied techniques that have proven effective in training neural networks in general to the TD(λ) algorithm. The basic implementation of the TD(λ) method resulted in stable performance and achieved a maximal performance of winning 90.4% of evaluation games. When changing parameter settings, the best performance was achieved by using different learning rates between layers in the neural network (92.6% wins), and this was followed by using a relatively high probability of action exploitation (91.8% wins).
- Published
- 2011
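The training setup summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TDLambdaNet`, the function `choose_move`, the tanh hidden layer, the linear output, and all parameter values are assumptions chosen for illustration. It shows a one-hidden-layer value network updated with TD(λ) eligibility traces, a separate learning rate per layer, and a move selector that exploits the network's estimate with probability `p_exploit`.

```python
import math
import random


class TDLambdaNet:
    """Hypothetical one-hidden-layer value network trained with TD(lambda)."""

    def __init__(self, n_in, n_hidden, lr_hidden=0.01, lr_output=0.005,
                 lam=0.7, gamma=1.0, seed=0):
        rng = random.Random(seed)
        # Small random initial weights (illustrative choice).
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        # Separate learning rates for the hidden and output layers.
        self.lr_hidden, self.lr_output = lr_hidden, lr_output
        self.lam, self.gamma = lam, gamma
        self.reset_traces()

    def reset_traces(self):
        """Clear eligibility traces, e.g. at the start of each game."""
        self.e1 = [[0.0] * len(row) for row in self.w1]
        self.e2 = [0.0] * len(self.w2)

    def forward(self, x):
        """Return the value estimate and the hidden activations."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
             for row in self.w1]
        v = sum(w * hj for w, hj in zip(self.w2, h))
        return v, h

    def value(self, x):
        return self.forward(x)[0]

    def update(self, x, target):
        """One TD(lambda) step for state x.

        target is r + gamma * V(next state) for a non-terminal move,
        or the final reward at the end of a game.
        """
        v, h = self.forward(x)
        delta = target - v  # TD error
        decay = self.gamma * self.lam
        # Decay the traces and add the gradient of V w.r.t. each weight.
        for j, hj in enumerate(h):
            grad_pre = self.w2[j] * (1.0 - hj * hj)
            for i, xi in enumerate(x):
                self.e1[j][i] = decay * self.e1[j][i] + grad_pre * xi
            self.e2[j] = decay * self.e2[j] + hj
        # Apply the update with a layer-specific learning rate.
        for j in range(len(self.w2)):
            for i in range(len(x)):
                self.w1[j][i] += self.lr_hidden * delta * self.e1[j][i]
            self.w2[j] += self.lr_output * delta * self.e2[j]
        return delta


def choose_move(net, candidate_states, p_exploit, rng):
    """Pick the index of the next state: greedy with probability p_exploit."""
    if rng.random() < p_exploit:
        return max(range(len(candidate_states)),
                   key=lambda k: net.value(candidate_states[k]))
    return rng.randrange(len(candidate_states))
```

In a full training loop, `candidate_states` would hold the encoded board positions reachable from the current position, and `update` would be called once per move with the traces reset at the start of each game. The board encoding and reward scheme are left out because the abstract does not specify them.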