Variance-Reducing Couplings for Random Features

Authors :
Reid, Isaac
Markou, Stratis
Choromanski, Krzysztof
Turner, Richard E.
Weller, Adrian
Publication Year :
2024

Abstract

Random features (RFs) are a popular technique to scale up kernel methods in machine learning, replacing exact kernel evaluations with stochastic Monte Carlo estimates. They underpin models as diverse as efficient transformers (by approximating attention) and sparse spectrum Gaussian processes (by approximating the covariance function). Efficiency can be further improved by speeding up the convergence of these estimates: a variance reduction problem. We tackle this through the unifying lens of optimal transport, finding couplings to improve RFs defined on both Euclidean and discrete input spaces. They enjoy theoretical guarantees and sometimes provide strong downstream gains, including for scalable approximate inference on graphs. We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm, showing that other properties of the coupling should be optimised for attention estimation in efficient transformers.
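
As an illustration of what "coupling" random features can mean, the following minimal sketch compares independent random Fourier features with orthogonally coupled ones for the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2). It is not the optimal-transport construction described in the abstract: orthogonal random features are a well-known earlier coupling, used here only as a stand-in. The function names, the dimension, the test inputs, and the number of trials are all assumptions chosen for illustration.

```python
import numpy as np


def iid_frequencies(d, rng):
    """d independent frequency vectors (rows), each drawn from N(0, I_d)."""
    return rng.standard_normal((d, d))


def orthogonal_frequencies(d, rng):
    """d frequency vectors that are each marginally N(0, I_d) but are
    coupled so that their directions are exactly orthogonal."""
    g = rng.standard_normal((d, d))
    q, r = np.linalg.qr(g)
    # Sign correction so that q is uniformly (Haar) distributed.
    q = q * np.sign(np.diag(r))
    # Independent chi-distributed norms restore the Gaussian marginals.
    norms = np.sqrt(rng.chisquare(df=d, size=d))
    return norms[:, None] * q.T


def rff_estimate(w, x, y):
    """Monte Carlo estimate of exp(-||x - y||^2 / 2) from frequencies w."""
    return np.mean(np.cos(w @ (x - y)))


def compare_variance(d=16, trials=2000, seed=0):
    """Return the exact kernel value and arrays of repeated estimates
    under the independent and the orthogonal sampling schemes."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    y = np.zeros(d)
    y[0] = 1.0  # assumed test inputs with ||x - y|| = 1
    exact = np.exp(-0.5 * np.sum((x - y) ** 2))
    iid = np.array(
        [rff_estimate(iid_frequencies(d, rng), x, y) for _ in range(trials)]
    )
    orth = np.array(
        [rff_estimate(orthogonal_frequencies(d, rng), x, y) for _ in range(trials)]
    )
    return exact, iid, orth


if __name__ == "__main__":
    exact, iid, orth = compare_variance()
    print("exact kernel value:", exact)
    print("iid mean, variance:", iid.mean(), iid.var())
    print("orthogonal mean, variance:", orth.mean(), orth.var())
```

Both schemes leave each frequency vector with the same marginal distribution, so both estimators are unbiased; only the joint distribution differs, which is the sense in which a coupling can change the variance without changing the mean.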

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2405.16541
Document Type :
Working Paper