Enhancing Model Parallelism in Neural Architecture Search for Multidevice System.
- Source :
- IEEE Micro. Sep/Oct 2020, Vol. 40 Issue 5, p46-55. 10p.
- Publication Year :
- 2020
Abstract
- Neural architecture search (NAS) finds favorable network topologies for better task performance. Existing hardware-aware NAS techniques target only inference-latency reduction on single-CPU/GPU systems, and the searched models can hardly be parallelized. To address this issue, we propose ColocNAS, the first synchronization-aware, end-to-end NAS framework that automates the design of parallelizable neural networks for multidevice systems while maintaining high task accuracy. ColocNAS defines a new search space with elaborated connectivity to reduce device communication and synchronization. ColocNAS consists of three phases: 1) offline latency profiling that constructs a lookup table of inference latencies of various networks for online runtime approximation; 2) differentiable latency-aware NAS that simultaneously minimizes inference latency and task error; and 3) reinforcement-learning-based device-placement fine-tuning to further reduce the latency of the deployed model. Extensive evaluation corroborates ColocNAS's effectiveness in reducing inference latency while preserving task accuracy. [ABSTRACT FROM AUTHOR]
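
How phases 1 and 2 can interact may be sketched as follows: the profiled lookup table makes latency differentiable by taking its expectation under a softmax relaxation of the architecture parameters, and that expected latency is added to the task error. This is a minimal illustrative sketch, not the authors' implementation; the function names, the weighting factor `lam`, and the per-layer table layout are assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over one layer's architecture parameters."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def latency_aware_loss(task_error, alphas, latency_table, lam=0.1):
    """Differentiable latency-aware objective (hypothetical sketch).

    alphas        -- per-layer lists of architecture parameters, one per candidate op
    latency_table -- per-layer lists of profiled latencies (the phase-1 lookup table)
    lam           -- assumed trade-off weight between task error and latency

    The expected latency under the softmax relaxation is a smooth proxy,
    so gradient-based NAS can minimize task error and latency jointly.
    """
    total = task_error
    for layer_alphas, layer_lat in zip(alphas, latency_table):
        probs = softmax(layer_alphas)
        total += lam * sum(p * l for p, l in zip(probs, layer_lat))
    return total
```

For example, with uniform architecture parameters over two candidate ops whose profiled latencies are 2 ms and 4 ms, the expected latency is 3 ms, so the penalty term adds `lam * 3` to the task error.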
Details
- Language :
- English
- ISSN :
- 02721732
- Volume :
- 40
- Issue :
- 5
- Database :
- Academic Search Index
- Journal :
- IEEE Micro
- Publication Type :
- Academic Journal
- Accession number :
- 145693353
- Full Text :
- https://doi.org/10.1109/MM.2020.3004538