Nonparametric Compositional Stochastic Optimization for Risk-Sensitive Kernel Learning
- Authors
Bedi, Amrit Singh; Koppel, Alec; Rajawat, Ketan; Sanyal, Panchajanya
- Subjects
Supervised learning; Statistics; Statistical accuracy; Stochastic approximation; Nonlinear functions; Pareto distribution; Stochastic dominance
- Abstract
In this work, we address optimization problems where the objective function is a nonlinear function of an expected value, i.e., compositional stochastic programs. We consider the case where the decision variable is not vector-valued but instead belongs to a Reproducing Kernel Hilbert Space (RKHS), motivated by risk-aware formulations of supervised learning. We develop the first memory-efficient stochastic algorithm for this setting, which we call Compositional Online Learning with Kernels (COLK). COLK, at its core a two time-scale stochastic approximation method, addresses two challenges: (i) compositions of expected-value problems cannot be handled by the stochastic gradient method due to the presence of an inner expectation; and (ii) the RKHS-induced parameterization has complexity proportional to the iteration index, which we mitigate through greedily constructed subspace projections. We provide, for the first time, a non-asymptotic tradeoff between the complexity of a function parameterization and its required convergence accuracy, for both strongly convex and non-convex objectives under constant step-sizes. Experiments with risk-sensitive supervised learning demonstrate that COLK consistently converges and performs reliably even when the data contains many outliers, and thus marks a step towards overcoming overfitting. Specifically, we observe a favorable tradeoff between model complexity, consistent convergence, and statistical accuracy for data drawn from heavy-tailed distributions.
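To make the two time-scale idea concrete, here is a minimal sketch of a compositional stochastic update in an RKHS, assuming a variance-regularized square loss, a Gaussian kernel, scalar inputs, and simple coefficient thresholding as a crude stand-in for the paper's greedily constructed subspace projections. Every name and constant below is illustrative, not the authors' COLK implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(x, z, bw=0.5):
    # Gaussian kernel on scalar inputs (assumed kernel choice)
    return np.exp(-(x - z) ** 2 / (2 * bw ** 2))

def f_eval(x, centers, coefs):
    # Kernel expansion f(x) = sum_i coefs[i] * k(centers[i], x)
    return sum(c * rbf(x, z) for z, c in zip(centers, coefs))

def sample():
    # Hypothetical heavy-tailed regression stream: y = sin(3x) + Student-t noise
    x = rng.uniform(-1.0, 1.0)
    return x, np.sin(3.0 * x) + 0.1 * rng.standard_t(2.0)

alpha, beta = 0.05, 0.1  # slow/fast step sizes (assumed constants)
lam, tol = 0.1, 1e-3     # variance weight and pruning threshold (assumed)
centers, coefs = [], []
u = 0.0                  # auxiliary iterate tracking the inner mean E[loss]

for t in range(2000):
    x, y = sample()
    resid = y - f_eval(x, centers, coefs)
    loss = resid ** 2
    # Fast time scale: running estimate of the inner expectation E[loss]
    u = (1.0 - beta) * u + beta * loss
    # Slow time scale: stochastic functional gradient of
    # E[loss] + lam * Var[loss]; the chain rule through the outer
    # nonlinearity yields the (1 + 2*lam*(loss - u)) correction factor
    scale = 1.0 + 2.0 * lam * (loss - u)
    # The RKHS gradient of (y - f(x))^2 is -2*resid*k(x, .), so a
    # descent step appends a new kernel center at the current sample
    centers.append(x)
    coefs.append(2.0 * alpha * scale * resid)
    # Crude stand-in for the paper's greedy subspace projection:
    # discard centers whose coefficients are negligible
    keep = [i for i, c in enumerate(coefs) if abs(c) > tol]
    centers = [centers[i] for i in keep]
    coefs = [coefs[i] for i in keep]

print(f"dictionary size: {len(centers)}, f(0.3) = {f_eval(0.3, centers, coefs):.3f}")
```

The auxiliary iterate u is updated on the faster time scale so that the slow functional-gradient step sees a stable estimate of the inner expectation, which is precisely the coupling an ordinary stochastic gradient step cannot provide for compositional objectives.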
- Published
2021