Back to Search Start Over

THE COMPLEXITY OF OPTIMAL SMALL POLICIES.

Authors :
Mundhenk, Martin
Source :
Mathematics of Operations Research; Feb2000, Vol. 25 Issue 1, p118, 12p, 3 Diagrams
Publication Year :
2000

Abstract

The article investigates the complexity of problems concerned with partially-observable Markov decision processes which are run for a finite number of steps under small policies. Partially-observable Markov decision processes (POMDP) model sequential decision-making when outcomes are uncertain and the state of the system cannot be completely observed. The optimal performance of a POMDP under any small policy of a given size can be calculated using a binary search method. This involves creating a series of new POMDPs from the given one by appending an initial state such that any action from that state yields a positive or negative reward equal to the shift from the last value considered. All classes contain polynomial time and are contained in polynomial space. Other classes that the authors mention are nondeterministic logarithmic space and probabilistic logarithmic space, which are defined like NP and complexity class PP, but the polynomial time bound is replaced by a logarithmic space bound.

Details

Language :
English
ISSN :
0364765X
Volume :
25
Issue :
1
Database :
Complementary Index
Journal :
Mathematics of Operations Research
Publication Type :
Academic Journal
Accession number :
3032365
Full Text :
https://doi.org/10.1287/moor.25.1.118.15214