A Framework and Benchmark for Deep Batch Active Learning for Regression
- Publication Year :
- 2022
- Publisher :
- arXiv, 2022.
Abstract
- The acquisition of labels for supervised learning can be expensive. In order to improve the sample-efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations and selection methods. Our framework encompasses many existing Bayesian methods based on Gaussian Process approximations of neural networks as well as non-Bayesian methods. Additionally, we propose to replace the commonly used last-layer features with sketched finite-width Neural Tangent Kernels, and to combine them with a novel clustering method. To evaluate different methods, we introduce an open-source benchmark consisting of 15 large tabular regression data sets. Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code. We provide open-source code that includes efficient implementations of all kernels, kernel transformations, and selection methods, and can be used for reproducing our results.
- Comment: Changes in v3: Improvements in writing and other minor changes. Accompanying code can be found at https://github.com/dholzmueller/bmdal_reg
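As a rough illustration of the "base kernel plus selection method" decomposition described in the abstract, the following minimal sketch pairs a linear kernel on precomputed feature vectors with a generic greedy farthest-point selection rule. It is not taken from the accompanying repository: the function names `linear_kernel`, `kernel_sq_dists`, and `greedy_max_dist_selection` are hypothetical, the feature arrays stand in for whatever network-dependent features one would use, and the selection rule is a simple stand-in, not the clustering method proposed in the paper.

```python
import numpy as np


def linear_kernel(feats_a, feats_b):
    """Base kernel (assumed): inner products of precomputed feature vectors."""
    return feats_a @ feats_b.T


def kernel_sq_dists(feats_a, feats_b):
    """Squared kernel-induced distances: k(x,x) + k(y,y) - 2 k(x,y)."""
    k_aa = np.sum(feats_a * feats_a, axis=1)[:, None]
    k_bb = np.sum(feats_b * feats_b, axis=1)[None, :]
    # Clamp tiny negative values caused by floating-point rounding.
    return np.maximum(k_aa + k_bb - 2.0 * linear_kernel(feats_a, feats_b), 0.0)


def greedy_max_dist_selection(pool_feats, train_feats, batch_size):
    """Pick batch_size pool indices, each time taking the pool point that is
    farthest from all labeled and already-selected points (illustrative rule)."""
    min_sq_dist = kernel_sq_dists(pool_feats, train_feats).min(axis=1)
    selected = []
    for _ in range(batch_size):
        idx = int(np.argmax(min_sq_dist))
        selected.append(idx)
        # Update each pool point's distance to its nearest chosen point.
        new_dist = kernel_sq_dists(pool_feats, pool_feats[idx:idx + 1])[:, 0]
        min_sq_dist = np.minimum(min_sq_dist, new_dist)
        # Exclude the chosen point from later rounds.
        min_sq_dist[idx] = -np.inf
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(size=(20, 8))   # features of already-labeled points
    pool = rng.normal(size=(200, 8))   # features of unlabeled candidates
    print(greedy_max_dist_selection(pool, train, batch_size=5))
```

In this sketch a kernel transformation would correspond to modifying the feature arrays (or the kernel matrix) before selection, for example by scaling or projecting them; swapping in different features or a different selection function leaves the rest of the loop unchanged.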
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....f40cede759fb1f27b9308f6cba0b15f2
- Full Text :
- https://doi.org/10.48550/arxiv.2203.09410