1. Toward the Predictability of Dynamic Real-Time DNN Inference
- Author
Mingsong Lv, Weiguang Pang, Di Liu, Wang Yi, Xu Jiang, and Teng Gao
- Subjects
Computer science, business.industry, Inference, Machine learning, computer.software_genre, Computer Graphics and Computer-Aided Design, Execution time, Constraint (information theory), Embedded applications, Deep neural networks, Artificial intelligence, Electrical and Electronic Engineering, Predictability, Adaptation (computer science), business, computer, Software
- Abstract
Deep neural networks (DNNs) are widely used in cyber-physical systems (CPS), yet deploying them in real-time systems remains challenging. In particular, the execution time of DNN inference must be predictable, so that it is known in advance whether run-time inference can complete within a required timing constraint. Moreover, in many embedded applications, such as autonomous cars, the timing constraints change dynamically with the run-time environment. One way to meet such dynamic real-time requirements is to execute different sub-networks of a DNN at run-time. However, improperly constructed sub-networks may introduce unpredictable inference time, so that real-time constraints are violated unexpectedly, and may also be poorly compatible with well-optimized machine learning frameworks (e.g., TensorFlow). In this paper, we study predictability when executing different sub-networks of a DNN. In particular, we present a feature-wise run-time adaptation framework for DNN inference, implemented and validated on NVIDIA Jetson TX2 and Nano with TensorFlow. The experimental results show that our method achieves predictable inference time in comparison with state-of-the-art methods.
- Published
- 2022