EvoLP: Self-Evolving Latency Predictor for Model Compression in Real-Time Edge Systems.
- Source :
- IEEE Embedded Systems Letters; Jun 2024, Vol. 16 Issue 2, p174-177, 4p
- Publication Year :
- 2024
Abstract
- Edge devices are increasingly utilized for deploying deep learning applications on embedded systems. The real-time nature of many applications and the limited resources of edge devices necessitate latency-targeted neural network compression. However, measuring latency on real devices is challenging and expensive. Therefore, this letter presents a novel and efficient framework, named EvoLP, to accurately predict the inference latency of models on edge devices. This predictor can evolve to achieve higher latency prediction precision during the network compression process. Experimental results on three edge devices and four model variants demonstrate that EvoLP outperforms previous state-of-the-art approaches. Moreover, when incorporated into a model compression framework, it effectively guides the compression process for higher model accuracy while satisfying strict latency constraints. We open-source EvoLP at https://github.com/ntuliuteam/EvoLP. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 19430663
- Volume :
- 16
- Issue :
- 2
- Database :
- Complementary Index
- Journal :
- IEEE Embedded Systems Letters
- Publication Type :
- Academic Journal
- Accession number :
- 177558615
- Full Text :
- https://doi.org/10.1109/LES.2023.3321599