1. Building a data-driven configuration space for automated machine learning
- Author
Ferreira da Costa, Pedro; Dafflon, Jessica; Lopez Pinaya, Walter Hugo; Monti, Ricardo; Smallwood, Jonathan; Bzdok, Danilo; Turkheimer, Federico; Tye, Charlotte; Jones, Emily JH; Cole, James; Leech, Robert
- Abstract
The prevalence of machine learning (ML) tools combined with the ever-growing number of prediction algorithms has created the need for efficient approaches to their selection for small-scale datasets. In this paper, we present a novel Automated Machine Learning (AutoML) solution that efficiently searches a prediction space formed from thousands of ML algorithm pipelines. We distill the high-dimensional data to a low-dimensional configuration space that is efficiently sampled through Bayesian optimization. We demonstrate how the automatically organized space serves as a strong prior to build an AutoML system and deploy it on smaller scale data. We further demonstrate how the space organization can optimize performance while minimizing computational and time resources. As a proof of principle, we apply the prediction space approach to EEG data from a prospective study of n=216 infants with and without a family history of autism.
- Published
- 2023