Finite state multi-armed bandit problems: sensitive-discount, average-reward and average-overtaking optimality

Authors :
Michael N. Katehakis
Uriel G. Rothblum
Source :
Ann. Appl. Probab. 6, no. 3 (1996), 1024-1034
Publication Year :
1996
Publisher :
The Institute of Mathematical Statistics, 1996.

Abstract

We express Gittins indices for multi-armed bandit problems as Laurent expansions around discount factor 1. The coefficients of these expansions are then used to characterize stationary optimal policies when the optimality criteria are sensitive-discount optimality (otherwise known as Blackwell optimality), average-reward optimality and average-overtaking optimality. We also obtain bounds and derive optimality conditions for policies of a type that continue playing the same bandit as long as the state of that bandit remains in prescribed sets.

Details

Language :
English
Database :
OpenAIRE
Journal :
Ann. Appl. Probab. 6, no. 3 (1996), 1024-1034
Accession number :
edsair.doi.dedup.....6629df64e3cf6d22a04dbb321d9d2d99