A Unified Analysis of Stochastic Momentum Methods for Deep Learning
- Source :
- IJCAI
- Publication Year :
- 2018
- Publisher :
- International Joint Conferences on Artificial Intelligence Organization, 2018.
Abstract
- Stochastic momentum methods have been widely adopted in training deep neural networks. However, their theoretical analysis of convergence of the training objective and the generalization error for prediction is still under-explored. This paper aims to bridge the gap between practice and theory by analyzing the stochastic gradient (SG) method, and the stochastic momentum methods including two famous variants, i.e., the stochastic heavy-ball (SHB) method and the stochastic variant of Nesterov's accelerated gradient (SNAG) method. We propose a framework that unifies the three variants. We then derive the convergence rates of the norm of gradient for the non-convex optimization problem, and analyze the generalization performance through the uniform stability approach. Particularly, the convergence analysis of the training objective exhibits that SHB and SNAG have no advantage over SG. However, the stability analysis shows that the momentum term can improve the stability of the learned model and hence improve the generalization performance. These theoretical insights verify the common wisdom and are also corroborated by our empirical analysis on deep learning.
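To make the idea of a single framework covering SG, SHB, and SNAG concrete, the sketch below shows one possible parameterization in which a scalar `s` selects the variant. The abstract does not state the paper's update rule, so the form used here, the parameter name `s`, and all function names are assumptions made for illustration. For reproducibility the sketch uses an exact gradient of a toy quadratic rather than a stochastic one.

```python
# Illustrative sketch only: the update form and the parameter `s` are
# assumptions, not taken from the abstract above.

def unified_momentum(grad, x0, alpha, beta, s, steps):
    """Momentum update with an interpolation parameter s (assumed form).

    y_{k+1}  = x_k - alpha * g_k
    ys_{k+1} = x_k - s * alpha * g_k
    x_{k+1}  = y_{k+1} + beta * (ys_{k+1} - ys_k),  with ys_0 = x_0
    """
    x = x0
    ys_prev = x0
    for _ in range(steps):
        g = grad(x)
        y = x - alpha * g
        ys = x - s * alpha * g
        x = y + beta * (ys - ys_prev)
        ys_prev = ys
    return x


def heavy_ball(grad, x0, alpha, beta, steps):
    """Classic heavy-ball: x_{k+1} = x_k - alpha*g_k + beta*(x_k - x_{k-1})."""
    x_prev, x = x0, x0
    for _ in range(steps):
        x_next = x - alpha * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


def nesterov(grad, x0, alpha, beta, steps):
    """Nesterov form: gradient step to y, then extrapolate along y - y_prev."""
    x, y_prev = x0, x0
    for _ in range(steps):
        y = x - alpha * grad(x)
        x = y + beta * (y - y_prev)
        y_prev = y
    return x


def plain_gradient(grad, x0, step, steps):
    """Gradient method without momentum."""
    x = x0
    for _ in range(steps):
        x = x - step * grad(x)
    return x


def quadratic_grad(x):
    """Exact gradient of the toy objective f(x) = 0.5 * (x - 3)^2."""
    return x - 3.0


if __name__ == "__main__":
    alpha, beta, steps = 0.1, 0.5, 20
    print("s=0 (heavy-ball):", unified_momentum(quadratic_grad, 0.0, alpha, beta, 0.0, steps))
    print("s=1 (Nesterov):  ", unified_momentum(quadratic_grad, 0.0, alpha, beta, 1.0, steps))
    print("s=1/(1-beta):    ", unified_momentum(quadratic_grad, 0.0, alpha, beta, 1.0 / (1.0 - beta), steps))
```

Under this assumed form, `s = 0` reproduces the heavy-ball iterates, `s = 1` reproduces the Nesterov iterates, and `s = 1/(1 - beta)` collapses to the plain gradient method with step size `alpha / (1 - beta)`, which can be checked by substituting each value into the update.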
- Subjects :
- Momentum (technical analysis)
Optimization problem
Computer science
business.industry
Generalization
Deep learning
Stability (learning theory)
02 engineering and technology
Term (time)
Convergence (routing)
0202 electrical engineering, electronic engineering, information engineering
Deep neural networks
Applied mathematics
020201 artificial intelligence & image processing
Artificial intelligence
business
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
- Accession number :
- edsair.doi...........b5a8dd51dbd935eb491cf48d3a9699c1
- Full Text :
- https://doi.org/10.24963/ijcai.2018/410