Overview of the Neural Network Compression and Representation (NNR) Standard
- Authors
Hamed Rezazadegan-Tavakoli, Wojciech Samek, Werner Bailer, Paul Haase, Karsten Müller, Swayambhoo Jain, Francesco Cricri, Miska Hannuksela, Shan Liu, Emre Aksu, Wei Jiang, Shahab Hamidi-Rad, Fabien Racapé, Heiner Kirchhoffer, and Wei Wang
- Subjects
Artificial neural network, Computer science, Quantization (signal processing), Encoding (memory), Media Technology, Coding and information theory, Pruning (decision trees), Electrical and Electronic Engineering, Representation (mathematics), Bitstream format, Algorithm, Decoding methods, Coding (social sciences)
- Abstract
Neural Network Coding and Representation (NNR) is the first international standard for efficient compression of neural networks (NNs). The standard is designed as a toolbox of compression methods that can be combined into coding pipelines. It can be used either as an independent coding framework (with its own bitstream format) or together with external neural network formats and frameworks. To provide the highest degree of flexibility, the compression methods operate per parameter tensor, so that proper decoding is ensured even when no structure information is provided. The NNR standard contains compression-efficient quantization and deep context-adaptive binary arithmetic coding (DeepCABAC) as its core encoding and decoding technologies, as well as neural network parameter pre-processing methods such as sparsification, pruning, low-rank decomposition, unification, local scaling, and batch-norm folding. NNR achieves a compression efficiency of more than 97% in transparent coding cases, i.e., without degrading classification quality such as top-1 or top-5 accuracy. This paper provides an overview of the technical features and characteristics of NNR.
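To make the per-tensor pipeline described above concrete, here is a minimal Python sketch, assuming only NumPy. The helper names (prune_tensor, quantize_tensor) are hypothetical, magnitude pruning and uniform scalar quantization merely stand in for NNR's richer toolbox, and the DeepCABAC entropy-coding stage is omitted entirely; this is an illustration of the per-tensor design, not the standard's actual algorithms.

```python
import numpy as np

def prune_tensor(w: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Magnitude-based sparsification: zero out the smallest weights."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)

def quantize_tensor(w: np.ndarray, bits: int = 8):
    """Uniform scalar quantization of a single parameter tensor."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize_tensor(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximate float tensor from quantized values."""
    return q.astype(np.float32) * scale

# Each tensor is processed independently, mirroring NNR's per-tensor design:
# decoding one tensor requires no structural information about the others.
rng = np.random.default_rng(0)
weights = {"conv1.weight": rng.normal(size=(16, 3, 3, 3)).astype(np.float32)}

coded = {}
for name, w in weights.items():
    q, scale = quantize_tensor(prune_tensor(w), bits=8)
    coded[name] = (q, scale)  # real NNR entropy-codes q with DeepCABAC

recon = {name: dequantize_tensor(q, s) for name, (q, s) in coded.items()}
err = np.max(np.abs(recon["conv1.weight"] - weights["conv1.weight"]))
print(f"max reconstruction error: {err:.4f}")
```

Because every tensor carries its own quantization parameters, a decoder can reconstruct any single tensor in isolation, which is the property the abstract refers to as ensuring proper decoding without structure information.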
- Published
- 2022