EvoLP: Self-Evolving Latency Predictor for Model Compression in Real-Time Edge Systems.

Authors :
Huai, Shuo
Kong, Hao
Li, Shiqing
Luo, Xiangzhong
Subramaniam, Ravi
Makaya, Christian
Lin, Qian
Liu, Weichen
Source :
IEEE Embedded Systems Letters; Jun 2024, Vol. 16 Issue 2, p174-177, 4p
Publication Year :
2024

Abstract

Edge devices are increasingly used to deploy deep learning applications on embedded systems. The real-time nature of many applications and the limited resources of edge devices necessitate latency-targeted neural network compression. However, measuring latency on real devices is challenging and expensive. Therefore, this letter presents a novel and efficient framework, named EvoLP, to accurately predict the inference latency of models on edge devices. The predictor evolves during the network compression process to achieve higher latency prediction precision. Experimental results on three edge devices and four model variants demonstrate that EvoLP outperforms previous state-of-the-art approaches. Moreover, when incorporated into a model compression framework, it effectively guides the compression process toward higher model accuracy while satisfying strict latency constraints. We open-source EvoLP at https://github.com/ntuliuteam/EvoLP. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1943-0663
Volume :
16
Issue :
2
Database :
Complementary Index
Journal :
IEEE Embedded Systems Letters
Publication Type :
Academic Journal
Accession number :
177558615
Full Text :
https://doi.org/10.1109/LES.2023.3321599