Learning and decisions in contextual multi-armed bandit tasks
- Source:
- Proceedings of the Annual Meeting of the Cognitive Science Society; vol. 37, iss. 0
- Publication Year:
- 2015
Abstract
- Contextual Multi-Armed Bandit (CMAB) tasks are a novel framework to assess decision making in uncertain environments. In a CMAB task, participants are presented with multiple options (arms), each characterized by a number of features (context) related to the reward associated with that arm. By choosing arms repeatedly and observing the rewards, participants can learn about the relation between context and reward and improve their decision strategy. We present two studies on how people behave in CMAB tasks. Within a stationary environment, we find that participants are best described by Thompson Sampling-based Gaussian Process models. This decision rule incorporates probability matching to the expected outcomes derived from a rational model of the task, and it is especially well-adapted to non-stationary environments. In a dynamic CMAB task we again find that participants are best described by probability matching of Gaussian Process expectations. Our findings imply that behavior previously referred to as "irrational" can actually be seen as a well-adapted strategy based on powerful inference algorithms.
- Keywords: Decision Making, Learning, Exploration-Exploitation, Contextual Multi-Armed Bandits
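The decision rule described in the abstract can be sketched in a few lines: maintain a Gaussian Process posterior over the context-to-reward function, draw one sample from that posterior for each arm's context, and choose the arm with the highest sampled value (probability matching). The paper's exact model specification is not given in this record, so the following is a minimal numpy-only sketch assuming a shared GP across arms, a squared-exponential (RBF) kernel, and Gaussian observation noise; all function names here are illustrative, not the authors'.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel between two sets of context vectors.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def gp_posterior(X, y, X_star, noise=0.1):
    # Standard GP regression posterior: mean and per-point variance
    # at the test contexts X_star, given observed (context, reward) data.
    K = rbf(X, X) + noise * np.eye(len(X))
    K_s = rbf(X, X_star)
    K_ss = rbf(X_star, X_star)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y
    cov = K_ss - K_s.T @ K_inv @ K_s
    return mu, np.clip(np.diag(cov), 1e-9, None)

def thompson_choose(history, arm_contexts, rng):
    # history: list of (context_vector, observed_reward) from past trials.
    # arm_contexts: (n_arms, n_features) array of this trial's arm features.
    if not history:
        return int(rng.integers(len(arm_contexts)))  # no data yet: pick at random
    X = np.array([h[0] for h in history])
    y = np.array([h[1] for h in history])
    mu, var = gp_posterior(X, y, arm_contexts)
    # Probability matching: one posterior sample per arm, pick the best.
    samples = rng.normal(mu, np.sqrt(var))
    return int(np.argmax(samples))
```

Because the chosen arm is a random draw from the posterior rather than its argmax mean, arms with uncertain rewards are still explored occasionally, which is why this rule copes well with the non-stationary environments the abstract mentions.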
Details
- Database:
- OAIster
- Journal:
- Proceedings of the Annual Meeting of the Cognitive Science Society; vol. 37, iss. 0
- Notes:
- application/pdf
- Publication Type:
- Electronic Resource
- Accession number:
- edsoai.on1449587781
- Document Type:
- Electronic Resource