
Convergence Rate of Incremental Gradient and Incremental Newton Methods

Authors :
GÜRBÜZBALABAN, M.
OZDAGLAR, A.
PARRILO, P. A.
Source :
SIAM Journal on Optimization. 2019, Vol. 29 Issue 4, p2542-2565. 24p.
Publication Year :
2019

Abstract

The incremental gradient (IG) method is a prominent algorithm for minimizing a finite sum of smooth convex functions and is used in many contexts, including large-scale data processing applications and distributed optimization over networks. It is a first-order method that processes the functions one at a time based on their gradient information. The incremental Newton method, on the other hand, is a second-order variant which additionally exploits the curvature information of the underlying functions and can therefore be faster. In this paper, we focus on the case when the objective function is strongly convex and present new convergence rate estimates for the incremental gradient and incremental Newton methods under constant and diminishing step sizes. For a decaying step-size rule α_k = R/k^s with s ∈ (0, 1] and R > 0, we show that the distance of the IG iterates to the optimal solution converges at rate O(1/k^s) (which translates into an O(1/k^{2s}) rate in the suboptimality of the objective value). For s > 1/2, under the additional assumption that the functions are smooth, this improves on the previous O(1/√k) results in distances obtained for the case when the functions are nonsmooth. We show that to achieve the fastest O(1/k) rate with a step size α_k = R/k^s, IG needs the step-size parameter R to be a function of the strong convexity constant, whereas the incremental Newton method does not. The results are based on viewing the IG method as a gradient descent method with gradient errors, developing upper bounds for the gradient error to derive inequalities that relate the distances of consecutive iterates to the optimal solution, and finally applying Chung's lemmas from the stochastic approximation literature to these inequalities to determine their asymptotic behavior. In addition, we construct examples to show the tightness of our rate results in terms of their dependence on k. [ABSTRACT FROM AUTHOR]
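To make the iteration concrete, the following is a minimal Python sketch of cyclic incremental gradient with the decaying step size α_k = R/k^s on a toy strongly convex least-squares sum. The quadratic component functions, the parameter values (R = 1, s = 1), and the helper name incremental_gradient are illustrative assumptions for this sketch and are not taken from the paper.

```python
import numpy as np

def incremental_gradient(grads, x0, R=1.0, s=1.0, n_cycles=1000):
    """Cycle through the component gradients, processing one function at a time,
    with diminishing step size alpha_k = R / k**s (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    k = 0  # inner-step counter
    for _ in range(n_cycles):
        for grad_i in grads:
            k += 1
            alpha_k = R / k**s           # decaying step-size rule alpha_k = R / k^s
            x = x - alpha_k * grad_i(x)  # gradient step on a single component
    return x

# Toy strongly convex finite sum: f(x) = sum_i 0.5 * (a_i^T x - b_i)^2,
# which is strongly convex when A has full column rank.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
grads = [lambda x, a=A[i], bi=b[i]: (a @ x - bi) * a for i in range(5)]

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)  # reference minimizer
x_ig = incremental_gradient(grads, np.zeros(3), R=1.0, s=1.0)
print("distance to optimum:", np.linalg.norm(x_ig - x_star))
```

As the abstract notes, with s = 1 the fastest O(1/k) distance rate is only attained when R is chosen as a function of the strong convexity constant, so for an arbitrary R the observed convergence may be slower.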

Details

Language :
English
ISSN :
1052-6234
Volume :
29
Issue :
4
Database :
Academic Search Index
Journal :
SIAM Journal on Optimization
Publication Type :
Academic Journal
Accession number :
144836785
Full Text :
https://doi.org/10.1137/17M1147846