
Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions.

Authors :
Wang, Mengdi
Liu, Han
Fang, Ethan
Source :
Mathematical Programming. Jan 2017, Vol. 161, Issue 1/2, pp. 419-449. 31 pp.
Publication Year :
2017

Abstract

Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values or a composition of two expected-value functions, i.e., the problem $$\min_x \mathbf{E}_v\left[ f_v\big(\mathbf{E}_w [g_w(x)]\big) \right].$$ In order to solve this stochastic composition problem, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of the quasi-gradient method. The SCGD algorithms update the solution based on noisy sample gradients of $$f_v, g_w$$ and use an auxiliary variable to track the unknown quantity $$\mathbf{E}_w\left[ g_w(x)\right]$$. We prove that SCGD converges almost surely to an optimal solution for convex optimization problems, as long as such a solution exists. The convergence involves the interplay of two iterations with different time scales. For nonsmooth convex problems, SCGD achieves a convergence rate of $$\mathcal{O}(k^{-1/4})$$ in the general case and $$\mathcal{O}(k^{-2/3})$$ in the strongly convex case, after taking $$k$$ samples. For smooth convex problems, SCGD can be accelerated to converge at a rate of $$\mathcal{O}(k^{-2/7})$$ in the general case and $$\mathcal{O}(k^{-4/5})$$ in the strongly convex case. For nonconvex problems, we prove that any limit point generated by SCGD is a stationary point, for which we also provide a convergence rate analysis. The stochastic setting in which one wants to optimize compositions of expected-value functions is very common in practice, and the proposed SCGD methods find wide application in learning, estimation, dynamic programming, etc. [ABSTRACT FROM AUTHOR]
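
The two-time-scale idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and does not come from this record: the toy problem (g_w(x) = (1 + w)x and f_v(y) = (y - 3)^2 + vy, with w and v standard normal), the step-size exponents and the 0.1 scale factor, and the function names. The update is one plausible reading of "use an auxiliary variable to track E_w[g_w(x)]": y is a running average of sampled inner values and moves on the faster time scale, while x takes a gradient step using y in place of the unknown expectation. For this toy problem the composed objective (x - 3)^2 is minimized at x = 3, whereas plugging a single sample g_w(x) straight into f_v leads to an objective whose minimizer is 1.5, which suggests why a classical stochastic gradient step does not apply directly.

```python
import random


def sample_g(x, w):
    """Noisy inner function g_w(x) = (1 + w) * x, so E_w[g_w(x)] = x (toy assumption)."""
    return (1.0 + w) * x


def grad_g(x, w):
    """Derivative of g_w with respect to x."""
    return 1.0 + w


def grad_f(y, v):
    """Derivative of f_v(y) = (y - 3)^2 + v * y with respect to y."""
    return 2.0 * (y - 3.0) + v


def scgd(steps=100_000, x0=0.0, seed=0, step_scale=0.1):
    """Two-time-scale sketch: y tracks E_w[g_w(x)], x descends using y."""
    rng = random.Random(seed)
    x, y = x0, 0.0
    for k in range(1, steps + 1):
        w, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        alpha = step_scale * k ** -0.75  # slow time scale (assumed schedule)
        beta = k ** -0.5                 # fast time scale (assumed schedule)
        # Auxiliary variable: running average of sampled inner values.
        y = (1.0 - beta) * y + beta * sample_g(x, w)
        # Chain rule with y standing in for the unknown expectation.
        x -= alpha * grad_g(x, w) * grad_f(y, v)
    return x


def naive_sgd(steps=100_000, x0=0.0, seed=0, step_scale=0.1):
    """Plug-in baseline: uses the single sample g_w(x) inside f_v (biased)."""
    rng = random.Random(seed)
    x = x0
    for k in range(1, steps + 1):
        w, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        alpha = step_scale * k ** -0.75
        x -= alpha * grad_g(x, w) * grad_f(sample_g(x, w), v)
    return x


if __name__ == "__main__":
    print("SCGD sketch estimate:", scgd())
    print("Naive plug-in estimate:", naive_sgd())
```

Running the script prints both estimates side by side. The step-size exponents are chosen only so that the tracking variable y adapts faster than x; they are not claimed to reproduce the rates quoted in the abstract.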

Details

Language :
English
ISSN :
0025-5610
Volume :
161
Issue :
1/2
Database :
Academic Search Index
Journal :
Mathematical Programming
Publication Type :
Academic Journal
Accession number :
120598599
Full Text :
https://doi.org/10.1007/s10107-016-1017-3