AdamD: Improved bias-correction in Adam
- Publication Year :
- 2021
Abstract
- Here I present a small update to the bias-correction term in the Adam optimizer that has the advantage of making smaller gradient updates in the first several steps of training. With the default bias-correction, Adam may actually make larger-than-requested gradient updates early in training. By keeping only the well-justified bias-correction of the second-moment gradient estimate, $v_t$, and excluding the bias-correction on the first-moment estimate, $m_t$, we obtain these more desirable update properties in the initial steps. The sensitivity of the default Adam implementation to the hyperparameters $\beta_1, \beta_2$ may be due in part to the originally proposed bias-correction procedure and its behavior in the early steps.
- Comment: 8 pages, 1 figure
- Subjects :
- Computer Science - Machine Learning
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2110.10828
- Document Type :
- Working Paper
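
The update-rule change described in the abstract can be written out explicitly. The sketch below is mine, not the authors' reference code; the function and parameter names (`adam_step`, `lr`, `beta1`, `beta2`, `eps`) are assumptions using the usual Adam defaults. It contrasts the standard Adam step with an AdamD-style step that bias-corrects only the second-moment estimate $v_t$:

```python
# Minimal sketch (assumed names, not the paper's reference implementation) of one
# optimizer step. With correct_first_moment=True this is standard Adam; with False
# it is the AdamD-style update from the abstract: bias correction on v_t only.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
              correct_first_moment=True):
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate m_t
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate v_t
    v_hat = v / (1 - beta2 ** t)              # bias-corrected v_t (kept in both variants)
    if correct_first_moment:
        m_used = m / (1 - beta1 ** t)         # Adam: bias-corrected m_t
    else:
        m_used = m                            # AdamD: raw m_t, so early updates are smaller
    theta = theta - lr * m_used / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

At $t = 1$ the standard Adam step is roughly $\alpha \cdot \mathrm{sign}(g_1)$, whereas the AdamD-style step is roughly $(1-\beta_1)\,\alpha \cdot \mathrm{sign}(g_1)$, i.e. about ten times smaller with the default $\beta_1 = 0.9$. This is the "smaller gradient updates in the first several steps" behavior the abstract describes.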