1. Benchmarking the Performance of Bayesian Optimization across Multiple Experimental Materials Science Domains
- Author
Armi Tiihonen, John W. Fisher, Benji Maruyama, Aldair E. Gongora, Kedar Hippalgaonkar, Flore Mekki-Berrada, Tonio Buonassisi, Zhe Liu, Saif A. Khan, Qiaohao Liang, Daniil Bash, Shijing Sun, Zekun Ren, Keith A. Brown, and James R. Deneault
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning (cs.LG); FOS: Physical sciences; Condensed Matter - Materials Science (cond-mat.mtrl-sci); Physics - Data Analysis, Statistics and Probability (physics.data-an); Bayesian optimization; Active learning (machine learning); Surrogate model; Gaussian process; Random forest; General Materials Science; Computer Science Applications; Mechanics of Materials; Modeling and Simulation
- Abstract
In the field of machine learning (ML) for materials optimization, active learning algorithms such as Bayesian Optimization (BO) have been leveraged to guide autonomous and high-throughput experimentation systems. However, very few studies have evaluated the efficiency of BO as a general optimization algorithm across a broad range of experimental materials science domains. In this work, we evaluate the performance of BO algorithms with a collection of surrogate model and acquisition function pairs across five diverse experimental materials systems: carbon nanotube polymer blends, silver nanoparticles, lead-halide perovskites, and additively manufactured polymer structures and shapes. By defining acceleration and enhancement metrics for general materials optimization objectives, we find that, for surrogate model selection, Gaussian Process (GP) with anisotropic kernels (automatic relevance determination, ARD) and Random Forests (RF) have comparable performance, and both outperform the commonly used GP without ARD. We discuss in detail the implicit distributional assumptions of RF and GP and the benefits of using GP with anisotropic kernels. We provide practical insights for experimentalists on surrogate model selection for BO during materials optimization campaigns.
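As a rough illustration of the comparison described in the abstract, the sketch below runs a pool-based BO loop with an expected-improvement acquisition function and swaps between two surrogates: a GP with an anisotropic (ARD) RBF kernel and a Random Forest. This is a minimal sketch using scikit-learn, not the authors' code; the toy objective, candidate pool, and all hyperparameters are illustrative assumptions standing in for the paper's experimental datasets.

```python
# Minimal sketch (not the paper's implementation): BO over a candidate pool,
# comparing a GP surrogate with an anisotropic (ARD) kernel against a Random Forest.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def objective(X):
    # Hypothetical 2-D objective to maximize (stand-in for a measured material property).
    return -np.sum((X - np.array([0.3, 0.7])) ** 2, axis=1)

pool = rng.uniform(0.0, 1.0, size=(500, 2))  # discrete pool of candidate process conditions

def expected_improvement(mu, sigma, best, xi=0.01):
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def predict_with_uncertainty(model, X):
    if isinstance(model, GaussianProcessRegressor):
        return model.predict(X, return_std=True)
    # For the forest, use the spread of per-tree predictions as an uncertainty proxy.
    per_tree = np.stack([tree.predict(X) for tree in model.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0)

def run_bo(make_model, n_init=5, n_iter=20):
    X = pool[rng.choice(len(pool), n_init, replace=False)]
    y = objective(X)
    for _ in range(n_iter):
        model = make_model().fit(X, y)
        mu, sigma = predict_with_uncertainty(model, pool)
        x_next = pool[np.argmax(expected_improvement(mu, sigma, y.max()))]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next[None, :]))
    return y.max()

# GP with ARD: one length scale learned per input dimension.
gp_ard = lambda: GaussianProcessRegressor(
    kernel=ConstantKernel() * RBF(length_scale=[1.0, 1.0]), normalize_y=True)
rf = lambda: RandomForestRegressor(n_estimators=100, random_state=0)

print("best found (GP-ARD):", run_bo(gp_ard))
print("best found (RF):    ", run_bo(rf))
```

In the paper's setting, the acceleration and enhancement metrics would be computed from the optimization traces of many such repeated campaigns rather than from a single best value as printed here.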
- Published
- 2021