1. The Big Win Strategy on Multi-Value Network
- Author
- Shun-Shii Lin, Nai-Yuan Chang, Surag Nair, and Chih-Hung Chen
- Subjects
Value (ethics); Focus (computing); Artificial neural network; Computer science; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Tree (data structure); Value network; 010201 computation theory & mathematics; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence
- Abstract
The AlphaZero approach has achieved great success, reaching superhuman performance across many challenging games, but we see at least three aspects that can be improved. First, AlphaZero estimates only whether a position leads to a win, draw, or loss and ignores how many points will be gained or lost. Second, AlphaZero uses Monte-Carlo Tree Search to derive an average of the values of all child nodes in a subtree. Third, AlphaZero does not consider depth rewards during the Monte-Carlo Tree Search. To address these three problems, we introduce a general-purpose framework, the Big-Best-Quick win strategy in Monte-Carlo Tree Search, that aims to surpass the AlphaZero approach. In this paper, we focus mainly on the Big-win strategy to improve the performance of AlphaZero without human knowledge. We obtain promising results: our Big-win approach improves the strength of a 6x6 Othello program, achieving a win rate of 63%, a loss rate of 28%, and a draw rate of 9% against the original AlphaZero approach under fair training and playing time conditions.
- Published
- 2018
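The first two limitations named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `outcome_and_margin` and `mean_backup` and the sample numbers are hypothetical, chosen only to show that a win/draw/loss target discards the point margin and that an averaged backup can hide a strong child value.

```python
from typing import List, Tuple


def outcome_and_margin(own_discs: int, opp_discs: int) -> Tuple[int, int]:
    """Return (ternary outcome, point margin) from one player's point of view.

    Hypothetical helper: the outcome is the win/draw/loss signal the abstract
    attributes to AlphaZero; the margin is the information it ignores.
    """
    margin = own_discs - opp_discs
    outcome = (margin > 0) - (margin < 0)  # +1 win, 0 draw, -1 loss
    return outcome, margin


def mean_backup(child_values: List[float]) -> float:
    """Average the child values, as in the averaged backup the abstract describes."""
    return sum(child_values) / len(child_values)


# Two finished 6x6 Othello games (36 squares): a narrow win and a sweep.
narrow_win = outcome_and_margin(19, 17)
sweep = outcome_and_margin(36, 0)
print(narrow_win, sweep)  # both have outcome +1, but margins 2 and 36

# Made-up child values: one strong move among two weak ones.
values = [1.0, -0.5, -0.5]
# max() is printed only for contrast; it is not claimed to be the paper's rule.
print(mean_backup(values), max(values))
```

Both games map to the same training signal of +1 even though their margins differ by 34 points, which is the gap the Big-win strategy is said to target.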