1. An empirical evaluation of interval estimation for Markov decision processes
- Author
- Michael L. Littman and Alexander L. Strehl
- Subjects
Process (engineering), Computer science, Decision theory, Interval estimation, Decision tree, Partially observable Markov decision process, Markov process, Markov model, Machine learning, Markov decision process, Artificial intelligence, Greedy algorithm
- Abstract
This work takes an empirical approach to evaluating three model-based reinforcement-learning methods. All methods intend to speed the learning process by mixing exploitation of learned knowledge with exploration of possibly promising alternatives. We consider ε-greedy exploration, which is computationally cheap and popular, but unfocused in its exploration effort; R-Max exploration, a simplification of an exploration scheme that comes with a theoretical guarantee of efficiency; and a well-grounded approach, model-based interval estimation, that better integrates exploration and exploitation. Our experiments indicate that effective exploration can result in dramatic improvements in the observed rate of learning.
- Published
- 2005
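
The abstract describes ε-greedy exploration as computationally cheap but unfocused. A minimal sketch of that selection rule follows, for illustration only: the function name `epsilon_greedy`, the `q_values` list of estimated action values, and the `rng` parameter are assumptions made here and are not taken from the paper.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index for one state (hypothetical helper).

    q_values: estimated value of each action in the current state.
    epsilon:  probability of exploring instead of exploiting.
    """
    if rng.random() < epsilon:
        # Explore: every action is equally likely, regardless of how
        # well it is already understood. This is the "unfocused" part.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest current estimate
    # (the lowest index wins ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Because the exploratory branch ignores how often each action has been tried, it spends as much effort on well-understood actions as on untried ones, which is the weakness the abstract contrasts with R-Max and model-based interval estimation.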