Bayesian Optimization Allowing for Common Random Numbers.
- Source :
- Operations Research; Nov/Dec 2022, Vol. 70, Issue 6, p3457-3472, 16p, 1 Chart, 5 Graphs
- Publication Year :
- 2022
Abstract
- More Efficient Bayesian Optimization Through the Use of Common Random Numbers. Bayesian optimization is a powerful tool for expensive stochastic black-box optimization problems, such as simulation-based optimization or hyperparameter tuning in machine learning systems. In "Bayesian Optimization Allowing for Common Random Numbers," Pearce, Poloczek, and Branke show how explicitly modeling the random seed in the Gaussian process surrogate model allows Bayesian optimization to exploit the structure in the noise and benefit from the variance reduction provided by common random numbers. The proposed knowledge gradient with common random numbers acquisition function iteratively determines a combination of input and random seed at which to evaluate the objective. It automatically trades off reusing old seeds, to benefit from the variance reduction of common random numbers, against querying new seeds, to avoid the bias that comes from relying on only a few seeds. The proposed algorithm is analyzed theoretically and empirically shows superior performance compared with previous approaches on various test problems.
- Bayesian optimization is a powerful tool for expensive stochastic black-box optimization problems, such as simulation-based optimization or machine learning hyperparameter tuning. Many stochastic objective functions implicitly require a random number seed as input. By explicitly reusing a seed, a user can exploit common random numbers, comparing two or more inputs under the same randomly generated scenario, such as a common customer stream in a job shop problem or the same random partition of data into training and validation sets for a machine learning algorithm. With the aim of finding an input with the best average performance over infinitely many seeds, we propose a novel Gaussian process model that jointly models both the output for each seed and the average over seeds. We then introduce the knowledge gradient for common random numbers, which iteratively determines a combination of input and random seed at which to evaluate the objective and automatically trades off reusing old seeds against querying new seeds, thus overcoming the need to evaluate inputs in batches or to measure differences of pairs as suggested in previous methods. We investigate the knowledge gradient for common random numbers both theoretically and empirically, finding that it achieves significant performance improvements at only moderate added computational cost.
- Funding: The first author gratefully acknowledges funding through the UK Engineering and Physical Sciences Research Council [Grant EP/101358X/1].
- Supplemental Material: The e-companion is available at https://doi.org/10.1287/opre.2021.2208. [ABSTRACT FROM AUTHOR]
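To make the abstract's modeling idea concrete, the sketch below illustrates one way a Gaussian process over (input, seed) pairs can share structure across seeds: a kernel with a common component plus a seed-specific component that is active only when two observations reuse the same seed. This is a minimal illustration consistent with the abstract's description, not the authors' exact model or code; the RBF kernel forms, lengthscales, variances, toy data, and all function names here are assumptions made for the example.

```python
# Minimal sketch (hypothetical kernel choices, not the paper's implementation):
# a GP over (input, seed) pairs with kernel
#     k((x, s), (x', s')) = k_x(x, x') + 1[s == s'] * k_eps(x, x'),
# so evaluations that reuse a seed (common random numbers) are correlated,
# while the seed-average objective is governed by k_x alone.

import numpy as np

def rbf(a, b, lengthscale=0.3, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def crn_kernel(X, S, X2, S2):
    """Kernel over (input, seed) pairs: a shared component plus a
    seed-specific component active only when the seeds match."""
    k_x = rbf(X, X2)                         # correlation of the seed-average
    same_seed = (S[:, None] == S2[None, :])  # indicator 1[s == s']
    k_eps = rbf(X, X2, lengthscale=0.5, variance=0.25)
    return k_x + same_seed * k_eps

# Toy data: noisy evaluations f(x, s), where the seed s fixes the scenario.
rng = np.random.default_rng(0)
X = np.array([0.1, 0.1, 0.4, 0.4, 0.8])  # inputs evaluated so far
S = np.array([0,   1,   0,   1,   0])    # seed used for each evaluation
y = np.sin(3 * X) + 0.5 * rng.standard_normal(2)[S]  # seed-driven noise

# Posterior mean of the seed-AVERAGE objective at test inputs. The
# cross-covariance uses only k_x: averaging over infinitely many seeds
# removes the seed-specific component.
Xstar = np.linspace(0.0, 1.0, 5)
K = crn_kernel(X, S, X, S) + 1e-8 * np.eye(len(X))  # jitter for stability
k_star = rbf(Xstar, X)                   # cov(average at x*, f(x, s))
mean_avg = k_star @ np.linalg.solve(K, y)
print(mean_avg)
```

On top of such a posterior, the paper's knowledge gradient for common random numbers would then score candidate (input, seed) pairs, trading off the reuse of old seeds against querying new ones; that acquisition step is beyond the scope of this sketch.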
Details
- Language :
- English
- ISSN :
- 0030-364X
- Volume :
- 70
- Issue :
- 6
- Database :
- Complementary Index
- Journal :
- Operations Research
- Publication Type :
- Academic Journal
- Accession number :
- 161062972
- Full Text :
- https://doi.org/10.1287/opre.2021.2208