
Adaptive operator selection with dynamic multi-armed bandits

Authors :
Michèle Sebag
Álvaro Fialho
Luis DaCosta
Marc Schoenauer
Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11), CentraleSupélec, Centre National de la Recherche Scientifique (CNRS)
Machine Learning and Optimisation (TAO), Inria Saclay - Île-de-France, Institut National de Recherche en Informatique et en Automatique (Inria), Centre National de la Recherche Scientifique (CNRS)
Microsoft Research - Inria Joint Centre (MSR - INRIA)
Source :
GECCO, Genetic and Evolutionary Computation Conference, ACM, Jul 2008, Atlanta, United States. pp. 913-920, ⟨10.1145/1389095.1389272⟩
Publication Year :
2008
Publisher :
ACM, 2008.

Abstract

10-year Impact Award at GECCO 2018; international audience.

An important step toward self-tuning Evolutionary Algorithms is to design efficient Adaptive Operator Selection procedures. Such a procedure is made of two main components: a credit assignment mechanism, which computes a reward for each operator at hand based on some characteristics of the past offspring; and an adaptation rule, which modifies the selection mechanism based on the rewards of the different operators. This paper is concerned with the latter, and proposes a new approach for it based on the well-known Multi-Armed Bandit paradigm. However, because basic Multi-Armed Bandit methods were developed for static frameworks, a specific Dynamic Multi-Armed Bandit algorithm is proposed, which hybridizes an optimal Multi-Armed Bandit algorithm with the statistical Page-Hinkley test, enabling the efficient detection of changes in time series. This original Operator Selection procedure is then compared to the state-of-the-art rules known as Probability Matching and Adaptive Pursuit on several artificial scenarios, after a careful sensitivity analysis of all methods. The Dynamic Multi-Armed Bandit method is found to outperform the other methods on a scenario from the literature, while on another scenario, the basic Multi-Armed Bandit performs best.
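The mechanism the abstract describes can be sketched as follows: a UCB1-style bandit selects the operator with the best upper confidence bound, and a Page-Hinkley test monitors each arm's reward stream, restarting the bandit when a reward drop is detected. This is a minimal illustrative sketch, not the paper's exact algorithm; the class name `DMAB` and the parameter names `scale`, `delta`, and `gamma` are assumptions, and the paper's own notation and threshold values may differ.

```python
import math


class DMAB:
    """Sketch of a Dynamic Multi-Armed Bandit: UCB1 arm selection
    combined with a Page-Hinkley change-detection test that restarts
    the bandit when a played arm's reward stream appears to drop.
    Parameter names are illustrative, not the paper's notation."""

    def __init__(self, n_arms, scale=1.0, delta=0.05, gamma=2.0):
        self.n_arms = n_arms
        self.scale = scale  # exploration strength of the UCB term
        self.delta = delta  # PH tolerance for small fluctuations
        self.gamma = gamma  # PH alarm threshold
        self.reset()

    def reset(self):
        """Forget all statistics, as if the process restarted."""
        self.counts = [0] * self.n_arms
        self.means = [0.0] * self.n_arms
        self.ph_sum = [0.0] * self.n_arms  # cumulative PH deviation
        self.ph_max = [0.0] * self.n_arms  # running maximum of ph_sum

    def select(self):
        """Pick an arm by the UCB1 rule (each arm is tried once first)."""
        for a in range(self.n_arms):
            if self.counts[a] == 0:
                return a
        total = sum(self.counts)
        return max(
            range(self.n_arms),
            key=lambda a: self.means[a]
            + self.scale * math.sqrt(2.0 * math.log(total) / self.counts[a]),
        )

    def update(self, arm, reward):
        """Record a reward; restart everything if a change is detected."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        # Page-Hinkley: ph_sum drifts downward once rewards fall more
        # than `delta` below the arm's running mean; a gap larger than
        # `gamma` from its past maximum signals a change.
        self.ph_sum[arm] += reward - self.means[arm] + self.delta
        self.ph_max[arm] = max(self.ph_max[arm], self.ph_sum[arm])
        if self.ph_max[arm] - self.ph_sum[arm] > self.gamma:
            self.reset()
```

In an adaptive operator selection loop, `select()` would return the index of the variation operator to apply, and `update()` would feed back whatever the credit assignment mechanism computes as that operator's reward; the full restart on alarm reflects the "restart the bandit on detected change" idea the abstract describes.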

Details

Database :
OpenAIRE
Journal :
Proceedings of the 10th annual conference on Genetic and evolutionary computation
Accession number :
edsair.doi.dedup.....4a42ed102110fb44b8f10ecf40e772ad