
Enhancing Model Parallelism in Neural Architecture Search for Multidevice System.

Authors :
Fu, Cheng
Chen, Huili
Yang, Zhenheng
Koushanfar, Farinaz
Tian, Yuandong
Zhao, Jishen
Source :
IEEE Micro. Sep/Oct2020, Vol. 40 Issue 5, p46-55. 10p.
Publication Year :
2020

Abstract

Neural architecture search (NAS) finds favorable network topologies for better task performance. Existing hardware-aware NAS techniques target only reducing inference latency on single-CPU/GPU systems, and the resulting models can hardly be parallelized. To address this issue, we propose ColocNAS, the first synchronization-aware, end-to-end NAS framework that automates the design of parallelizable neural networks for multidevice systems while maintaining high task accuracy. ColocNAS defines a new search space with carefully designed connectivity to reduce interdevice communication and synchronization. ColocNAS consists of three phases: 1) offline latency profiling, which constructs a lookup table of the inference latencies of candidate networks for online runtime approximation; 2) differentiable latency-aware NAS, which simultaneously minimizes inference latency and task error; and 3) reinforcement-learning-based device-placement fine-tuning, which further reduces the latency of the deployed model. Extensive evaluation corroborates ColocNAS's effectiveness in reducing inference latency while preserving task accuracy. [ABSTRACT FROM AUTHOR]
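To make the first two phases concrete, here is a minimal sketch of how a latency lookup table can feed a differentiable latency penalty in the search objective: candidate-operation probabilities (a softmax over architecture parameters) weight the table's measured latencies, so latency becomes differentiable with respect to the architecture parameters. All names (`LATENCY_MS`, `expected_latency`, `total_loss`) and the latency numbers are illustrative assumptions, not the paper's actual implementation.

```python
import math

# Hypothetical latency lookup table (phase 1): measured inference latency
# in milliseconds for each candidate operation on a target device.
# Numbers are illustrative only, not from the paper.
LATENCY_MS = {"conv3x3": 1.8, "conv5x5": 3.1, "skip": 0.1}

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def expected_latency(arch_params):
    """Differentiable latency proxy (phase 2): the softmax-weighted sum of
    table latencies, so the search can trade latency against task error."""
    ops = sorted(LATENCY_MS)
    probs = softmax([arch_params[op] for op in ops])
    return sum(p * LATENCY_MS[op] for p, op in zip(probs, ops))

def total_loss(task_error, arch_params, lam=0.1):
    # Joint objective: task error plus a weighted latency penalty.
    return task_error + lam * expected_latency(arch_params)
```

With uniform architecture parameters, `expected_latency` is simply the mean of the table entries; as training pushes probability mass toward cheap operations such as `skip`, the penalty term shrinks accordingly.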

Details

Language :
English
ISSN :
02721732
Volume :
40
Issue :
5
Database :
Academic Search Index
Journal :
IEEE Micro
Publication Type :
Academic Journal
Accession number :
145693353
Full Text :
https://doi.org/10.1109/MM.2020.3004538