
A Framework and Benchmark for Deep Batch Active Learning for Regression

Authors :
Holzmüller, David
Zaverkin, Viktor
Kästner, Johannes
Steinwart, Ingo
Publication Year :
2022
Publisher :
arXiv, 2022.

Abstract

The acquisition of labels for supervised learning can be expensive. In order to improve the sample-efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations and selection methods. Our framework encompasses many existing Bayesian methods based on Gaussian Process approximations of neural networks as well as non-Bayesian methods. Additionally, we propose to replace the commonly used last-layer features with sketched finite-width Neural Tangent Kernels, and to combine them with a novel clustering method. To evaluate different methods, we introduce an open-source benchmark consisting of 15 large tabular regression data sets. Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code. We provide open-source code that includes efficient implementations of all kernels, kernel transformations, and selection methods, and can be used for reproducing our results.

Comment: Changes in v3: Improvements in writing and other minor changes. Accompanying code can be found at https://github.com/dholzmueller/bmdal_reg
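The three-part composition named in the abstract (base kernel, kernel transformation, selection method) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function names are hypothetical and are not the API of the accompanying code, the base kernel is a plain linear kernel on given feature vectors rather than a network-dependent one, the transformation is a simple scaling, and the selection rule is a generic greedy farthest-point heuristic, not necessarily the clustering method the authors propose.

```python
import numpy as np


def linear_kernel(feats_a, feats_b):
    """Base kernel (illustrative): inner products of feature vectors."""
    return feats_a @ feats_b.T


def scale_kernel(kernel_fn, factor):
    """Kernel transformation (illustrative): scale a kernel by a positive factor."""
    def scaled(a, b):
        return factor * kernel_fn(a, b)
    return scaled


def kernel_sq_dists(kernel_fn, a, b):
    """Squared distances induced by a kernel: k(x,x) + k(y,y) - 2 k(x,y)."""
    diag_a = np.diag(kernel_fn(a, a))
    diag_b = np.diag(kernel_fn(b, b))
    return diag_a[:, None] + diag_b[None, :] - 2.0 * kernel_fn(a, b)


def select_batch(kernel_fn, pool, train, batch_size):
    """Selection method (illustrative): greedy farthest-point selection.

    Repeatedly picks the pool point whose kernel distance to the closest
    already-labeled or already-selected point is largest.
    """
    # Distance from each pool point to its nearest labeled point.
    min_d = kernel_sq_dists(kernel_fn, pool, train).min(axis=1)
    chosen = []
    for _ in range(batch_size):
        idx = int(np.argmax(min_d))
        chosen.append(idx)
        # Update nearest distances to account for the newly selected point.
        d_new = kernel_sq_dists(kernel_fn, pool, pool[idx:idx + 1])[:, 0]
        min_d = np.minimum(min_d, d_new)
        min_d[idx] = -np.inf  # never select the same pool point twice
    return chosen


# Hypothetical toy data: one labeled point and four unlabeled candidates.
train_feats = np.array([[0.0, 0.0]])
pool_feats = np.array([[0.1, 0.0], [5.0, 0.0], [0.0, 3.0], [5.1, 0.0]])

kernel = scale_kernel(linear_kernel, 2.0)
batch = select_batch(kernel, pool_feats, train_feats, batch_size=2)
print(batch)
```

The sketch keeps the three components as separate, swappable functions, so a different kernel or selection rule could be substituted without changing the other parts.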

Details

Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....f40cede759fb1f27b9308f6cba0b15f2
Full Text :
https://doi.org/10.48550/arxiv.2203.09410