
Compression of models and data in deep learning

Authors:
Alizadeh, M
Markham, A
Han, S
Lane, N
Gal, Y
Publication Year:
2023

Abstract

We face many challenges in deploying high-performance neural networks in practice. These challenges are predominantly due to the size of neural networks and apply to both training and inference. Compressing neural networks to make them train and run more efficiently is therefore crucial, and has been a parallel line of research since the early days of neural network development. The two main compression techniques in deep learning, and the focus of this thesis, are pruning and quantization. This thesis explores how the information from higher-order gradients (meta-gradients) can be used to improve deep learning compression. We start by identifying a fundamental limitation in the formulation of pruning: although many methods, such as saliency-based pruning, follow pruning with a training or fine-tuning stage, parameter saliencies only consider a snapshot of the parameters without taking their "trainability" into account. We show how meta-gradients can be used as a more informative signal to find better trainable subnetworks at initialization. We then look at quantized neural networks and show how meta-gradients can be used in a regularization scheme to "learn" models with inherent robustness against post-training quantization. Finally, we look at the dual compression problem, i.e. using neural networks to compress data sources. We start with images and propose a simple autoencoder-free architecture in which we store the weights of a neural network instead of the RGB values of image pixels. We then use meta-gradients to meta-learn a base network that amortizes the cost of training one network per input. A significant advantage of our learned compression scheme is that it is agnostic to the data type, and we present results on various data types beyond 2D images. Importantly, we evaluate the usefulness of standard DNN compression techniques, e.g. quantization, for this new type of neural network.
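The "store weights instead of pixels" idea from the abstract can be illustrated with a minimal sketch: overfit a small coordinate MLP that maps (x, y) pixel positions to RGB values, so the trained weights become the compressed representation of that one image. This is an illustrative PyTorch reconstruction of the general approach, not the thesis's exact architecture; the class and function names (CoordinateMLP, fit_image) and all hyperparameters are assumptions chosen for the example.

import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    """Maps normalized 2D coordinates to RGB values in [0, 1]."""
    def __init__(self, hidden=64, layers=3):
        super().__init__()
        dims = [2] + [hidden] * layers + [3]
        blocks = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            blocks += [nn.Linear(d_in, d_out), nn.ReLU()]
        blocks[-1] = nn.Sigmoid()  # final activation: RGB in [0, 1]
        self.net = nn.Sequential(*blocks)

    def forward(self, xy):
        return self.net(xy)

def fit_image(image, steps=2000, lr=1e-3):
    """Fit one network to one image; image is an (H, W, 3) tensor in [0, 1]."""
    h, w, _ = image.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    targets = image.reshape(-1, 3)

    model = CoordinateMLP()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(coords), targets)
        loss.backward()
        opt.step()
    # The "compressed" image is now model.state_dict(); decoding is a
    # single forward pass over the coordinate grid.
    return model

Under this framing, the meta-learning step described in the abstract would replace the random initialization of CoordinateMLP with a base network whose weights are meta-learned across many inputs, amortizing the per-image training cost, and the concluding point about applying standard DNN compression (e.g. quantizing the stored weights) acts directly on the returned state_dict.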

Details

Language:
English
Database:
OpenAIRE
Accession number:
edsair.od......1064..d6199ea7e31bc79e7f692ada03ceb585