LLR: Learning learning rates by LSTM for training neural networks
- Source :
- Neurocomputing. 394:41-50
- Publication Year :
- 2020
- Publisher :
- Elsevier BV, 2020.
Abstract
- In training deep neural networks, the learning rate plays an important role in whether training converges and how fast it converges. To ensure convergence, most existing optimization methods adopt a small, hand-designed learning rate that decreases over multiple stages. However, this approach converges slowly, especially in the early stage of training. A strategy that adjusts the learning rate automatically and reduces the loss faster would therefore help the training of deep models. In this paper, a dynamic learning rate adjustment strategy is developed based on the Long Short-Term Memory (LSTM) model and the gradients of the loss function. The method exploits the LSTM's ability to treat the learning rates over multiple steps as a whole, and generates the learning rate of the current step from the memory of previous learning rates. Three datasets and four architectures are used in the experiments. We applied the learning rate adjustment method to various optimization methods and found that it achieves a smaller loss under the same number of iterations.
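The idea summarized in the abstract, a recurrent cell that reads gradient information and emits the learning rate for the current step, can be illustrated with a small sketch. This is not the authors' architecture: the scalar cell `TinyLSTMCell`, its fixed placeholder weights, the log-gradient input feature, the `lr_max` cap, and the toy loss L(w) = w^2 are all assumptions made here for illustration. In the paper's setting the LSTM would be trained; here it is left untrained to keep the example self-contained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyLSTMCell:
    """Scalar LSTM cell (hidden size 1) with fixed, untrained weights."""

    def __init__(self):
        # Per gate: (weight on input x, weight on previous hidden h, bias).
        # Arbitrary placeholder values, NOT learned parameters.
        self.gates = {
            "i": (0.5, 0.1, 0.0),  # input gate
            "f": (0.3, 0.1, 1.0),  # forget gate
            "o": (0.4, 0.1, 0.0),  # output gate
            "g": (0.8, 0.2, 0.0),  # candidate cell value
        }
        self.h = 0.0  # hidden state
        self.c = 0.0  # cell state: the memory carried across steps

    def step(self, x):
        pre = {
            name: w_x * x + w_h * self.h + b
            for name, (w_x, w_h, b) in self.gates.items()
        }
        i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
        g = math.tanh(pre["g"])
        self.c = f * self.c + i * g
        self.h = o * math.tanh(self.c)
        return self.h


def train(steps=50, lr_max=0.4):
    """Minimize the toy loss L(w) = w^2 with an LSTM-generated learning rate."""
    cell = TinyLSTMCell()
    w = 5.0
    losses, lrs = [], []
    for _ in range(steps):
        grad = 2.0 * w                        # dL/dw
        feature = math.log(abs(grad) + 1e-8)  # gradient-magnitude input
        h = cell.step(feature)                # state remembers earlier steps
        lr = lr_max * sigmoid(h)              # squashed into (0, lr_max)
        w -= lr * grad                        # plain gradient step
        losses.append(w * w)
        lrs.append(lr)
    return losses, lrs


if __name__ == "__main__":
    losses, lrs = train()
    print(f"final loss: {losses[-1]:.3e}, last learning rate: {lrs[-1]:.3f}")
```

Because the output is squashed into (0, `lr_max`) with `lr_max` below 0.5, every step on this quadratic shrinks `w`, so the sketch converges regardless of the placeholder weights. That bound is a property of this toy setup, not a claim about the published method.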
- Subjects :
- 0209 industrial biotechnology
Artificial neural network
Computer science
business.industry
Cognitive Neuroscience
Process (computing)
Training (meteorology)
02 engineering and technology
Function (mathematics)
Machine learning
computer.software_genre
Computer Science Applications
020901 industrial engineering & automation
Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
business
computer
Details
- ISSN :
- 0925-2312
- Volume :
- 394
- Database :
- OpenAIRE
- Journal :
- Neurocomputing
- Accession number :
- edsair.doi...........b879da5ae366e6746cbf6a62de88d3b1