1. An analysis of model-based Interval Estimation for Markov Decision Processes
- Author
- Michael L. Littman and Alexander L. Strehl
- Subjects
- Markov kernel, Computer science, Computer Networks and Communications, Variable-order Markov model, Applied Mathematics, Interval estimation, Partially observable Markov decision process, Markov process, Markov model, Machine learning, Theoretical Computer Science, Computational Theory and Mathematics, Reinforcement learning, Artificial intelligence, Markov decision process
- Abstract
- Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents a theoretical analysis of MBIE and a new variation called MBIE-EB, proving their efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less "online" cousins from the literature.
- Published
- 2008