The Big Win Strategy on Multi-Value Network
- Source :
- Proceedings of the 2018 International Conference on Machine Learning and Machine Intelligence.
- Publication Year :
- 2018
- Publisher :
- ACM, 2018.
Abstract
- The AlphaZero approach has been highly successful, achieving superhuman performance across many challenging games, but we see at least three aspects that can be improved. First, AlphaZero estimates only whether it will win, draw, or lose, and ignores how many points it will gain or lose. Second, AlphaZero uses Monte-Carlo Tree Search to derive an average over the values of all child nodes in a subtree. Third, AlphaZero does not consider depth rewards during Monte-Carlo Tree Search. To address these three problems, we introduce a general-purpose framework, the Big-Best-Quick win strategy in Monte-Carlo Tree Search, that aims to surpass the AlphaZero approach. In this paper, we focus mainly on the Big-win strategy to improve the performance of AlphaZero without human knowledge. We obtain promising results: our Big-win approach improves the strength of a 6x6 Othello program, achieving a win rate of 63%, a loss rate of 28%, and a draw rate of 9% against the original AlphaZero approach under fair training and playing-time conditions.
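The first two points in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the scaling of the disc difference by the 36 cells of a 6x6 board, and the use of a plain maximum as the alternative to averaging are all assumptions made here for illustration only.

```python
from typing import Sequence


def outcome_target(disc_diff: int) -> int:
    """Win/draw/lose target: keeps only the sign of the final result."""
    return (disc_diff > 0) - (disc_diff < 0)


def margin_target(disc_diff: int, board_cells: int = 36) -> float:
    """Hypothetical margin-aware target (an assumption, not from the paper):
    the final disc difference scaled into [-1, 1] for a 6x6 board."""
    return disc_diff / board_cells


def mean_backup(child_values: Sequence[float]) -> float:
    """Average over child values, as the abstract describes for AlphaZero."""
    return sum(child_values) / len(child_values)


def best_backup(child_values: Sequence[float]) -> float:
    """Hypothetical alternative (an assumption): keep only the best child value."""
    return max(child_values)


if __name__ == "__main__":
    # A narrow win and a large win look identical to the outcome target
    # but differ under the margin-aware target.
    for diff in (2, 30):
        print(diff, outcome_target(diff), margin_target(diff))
    values = [0.25, 0.5, 0.75]
    print(mean_backup(values), best_backup(values))
```

In this sketch a 2-disc win and a 30-disc win both map to the same win/draw/lose target, while the margin-aware target separates them, which is the kind of information the abstract says the original approach ignores.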
- Subjects :
- Value (ethics)
Focus (computing)
Artificial neural network
business.industry
Computer science
0102 computer and information sciences
02 engineering and technology
01 natural sciences
Tree (data structure)
Value network
010201 computation theory & mathematics
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
business
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 2018 International Conference on Machine Learning and Machine Intelligence
- Accession number :
- edsair.doi...........74603cb9b76ad46a6630c5608bae78f3
- Full Text :
- https://doi.org/10.1145/3278312.3278325