Convergence-Aware Neural Network Training
- Source : DAC
- Publication Year : 2020
- Publisher : IEEE, 2020.
Abstract
- Training a deep neural network (DNN) is expensive, requiring a large amount of computation time. While the training overhead is high, not all computation in DNN training is equal. Some parameters converge faster than others, so their gradient computation may contribute little to the parameter update; near stationary points, a subset of parameters may change very little. In this paper we exploit parameter convergence to optimize gradient computation in DNN training. We design a lightweight monitoring technique to track parameter convergence, and we prune the gradient computation stochastically for groups of semantically related parameters, exploiting their convergence correlations. These techniques are efficiently implemented in existing GPU kernels. In our evaluation, the optimization techniques substantially and robustly improve training throughput for four DNN models on three public datasets.
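- The paper implements these techniques inside existing GPU kernels; the sketch below is only a rough, framework-level approximation of the idea in PyTorch, not the authors' implementation. It tracks an exponential moving average of per-group relative parameter change and stochastically skips gradient computation for near-converged groups by temporarily disabling requires_grad. The names ConvergenceMonitor and skip_prob, the grouping by layer, and the threshold value are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ConvergenceMonitor:
    """Tracks an EMA of each group's relative parameter change per step and
    turns it into a probability of skipping that group's gradient computation."""
    def __init__(self, groups, decay=0.9, eps=1e-8):
        self.groups = groups          # dict: group name -> list of parameters
        self.decay = decay
        self.eps = eps
        self.prev = {n: [p.detach().clone() for p in ps] for n, ps in groups.items()}
        self.ema = {n: None for n in groups}

    def update(self):
        """Call after optimizer.step(): refresh the per-group change EMA."""
        for name, params in self.groups.items():
            delta = sum((p.detach() - q).norm().item()
                        for p, q in zip(params, self.prev[name]))
            scale = sum(p.detach().norm().item() for p in params) + self.eps
            rel = delta / scale
            self.ema[name] = rel if self.ema[name] is None else (
                self.decay * self.ema[name] + (1 - self.decay) * rel)
            self.prev[name] = [p.detach().clone() for p in params]

    def skip_prob(self, name, threshold=1e-3):
        """Skip more often the further the relative change falls below threshold."""
        if self.ema[name] is None:
            return 0.0
        return max(0.0, 1.0 - self.ema[name] / threshold)

# Illustrative training loop on random data: one parameter group per linear layer.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
groups = {f"layer{i}": list(m.parameters())
          for i, m in enumerate(model) if isinstance(m, nn.Linear)}
monitor = ConvergenceMonitor(groups)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

    # Stochastically freeze converged groups for this iteration so autograd
    # skips their gradient computation entirely.
    frozen = [n for n in groups if torch.rand(()).item() < monitor.skip_prob(n)]
    if len(frozen) == len(groups):      # never freeze everything, or backward() fails
        frozen = frozen[:-1]
    for n in frozen:
        for p in groups[n]:
            p.requires_grad_(False)

    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()                    # SGD skips parameters with no gradient
    optimizer.zero_grad()

    for n in frozen:                    # restore for the next iteration
        for p in groups[n]:
            p.requires_grad_(True)
    monitor.update()
```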
- Subjects :
- Artificial neural network
  Computer science
  Computation
  Distributed computing
  Training
  Convergence
  Overhead (computing)
  Algorithm
  Throughput
Details
- Database : OpenAIRE
- Journal : 2020 57th ACM/IEEE Design Automation Conference (DAC)
- Accession number : edsair.doi...........ac7cf0bb040a5867f6f7f945d748946a
- Full Text : https://doi.org/10.1109/dac18072.2020.9218518