
On the convergence of a stochastic approximation method for structured bi-level optimization

Authors :
Couellan, Nicolas
Wang, Wenjuan
Ecole Nationale de l'Aviation Civile (ENAC)
Institut de Mathématiques de Toulouse UMR5219 (IMT), Université de Toulouse : Université Toulouse 1 Capitole (UT1), Université Toulouse - Jean Jaurès (UT2J), Université Toulouse III - Paul Sabatier (UT3), Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Centre National de la Recherche Scientifique (CNRS)
Bournemouth University [Poole] (BU)
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

We analyze the convergence of stochastic gradient methods for well-structured bi-level optimization problems. We address two specific cases: first, when the outer objective function can be expressed as a finite sum of independent terms, and second, when both the outer and inner objective functions can be expressed as finite sums of independent terms. We assume Lipschitz continuity and differentiability of both objectives as well as convexity of the inner objective, and consider diminishing step sizes. We show that, under these conditions and some additional assumptions on the implicit function and the variance of the gradient errors, both methods converge in expectation to a stationary point of the problem if the gradient approximations are chosen so as to satisfy a sufficient decrease condition. We also discuss how our assumptions are satisfied in machine learning problems, where these methods apply naturally to automatic hyperparameter tuning when the loss functions are very large sums of error terms.
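
As an illustration of the setting described in the abstract, the sketch below runs a minimal stochastic approximation loop on a toy instance of the first case: an outer objective given by a finite sum (a validation loss over m terms) and a strongly convex inner problem (ridge regression). The toy instance, the synthetic data, the step-size constant c, and the function names (inner_solution, hypergradient_sample) are illustrative assumptions, not the authors' algorithm or experiments; the hypergradient is obtained by implicit differentiation of the inner optimality condition, and the outer update uses diminishing step sizes alpha_k = c/(k+1), matching the step-size regime assumed in the abstract.

import numpy as np

# Hypothetical toy instance of the structured bi-level problem (illustrative only):
#   outer: F(lmbda) = (1/m) * sum_j 0.5 * (x_j^T w*(lmbda) - y_j)^2      (validation loss)
#   inner: w*(lmbda) = argmin_w (1/(2n)) * ||X_tr w - y_tr||^2 + (lmbda/2) * ||w||^2
# The inner objective is strongly convex, so w*(lmbda) is well defined and smooth.

rng = np.random.default_rng(0)
d, n, m = 5, 200, 100
w_true = rng.normal(size=d)
X_tr, X_val = rng.normal(size=(n, d)), rng.normal(size=(m, d))
y_tr = X_tr @ w_true + 0.1 * rng.normal(size=n)
y_val = X_val @ w_true + 0.1 * rng.normal(size=m)

def inner_solution(lmbda):
    """Closed-form minimizer of the ridge-regularized inner problem."""
    A = X_tr.T @ X_tr / n + lmbda * np.eye(d)
    return np.linalg.solve(A, X_tr.T @ y_tr / n)

def hypergradient_sample(lmbda, j):
    """Stochastic estimate of dF/dlmbda from a single outer term j.

    Implicit differentiation of the inner optimality condition
      (X_tr^T X_tr / n + lmbda I) w*(lmbda) = X_tr^T y_tr / n
    gives dw*/dlmbda = -A^{-1} w*(lmbda).
    """
    w_star = inner_solution(lmbda)
    A = X_tr.T @ X_tr / n + lmbda * np.eye(d)
    dw_dlmbda = -np.linalg.solve(A, w_star)
    residual_j = X_val[j] @ w_star - y_val[j]      # gradient of the j-th outer term w.r.t. w
    return residual_j * (X_val[j] @ dw_dlmbda)

# Stochastic approximation on the outer variable with diminishing step sizes.
lmbda, c = 1.0, 0.5
for k in range(500):
    j = rng.integers(m)                            # sample one term of the outer finite sum
    g = hypergradient_sample(lmbda, j)
    lmbda = max(lmbda - c / (k + 1) * g, 1e-8)     # keep the regularization parameter positive
print("tuned lambda:", lmbda)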

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.dedup.wf.001..994281fb37862094cff9a61b33e49778