
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Authors:
Wayne Xin Zhao
Peiyu Liu
Ji-Rong Wen
Zhong-Yi Lu
Ze-Feng Gao
Zhi-Yuan Xie
Source:
ACL/IJCNLP (1)
Publication Year:
2021
Publisher:
Association for Computational Linguistics, 2021.

Abstract

This paper presents a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO) from quantum many-body physics. An MPO decomposes an original matrix into central tensors (containing the core information) and auxiliary tensors (holding only a small proportion of the parameters). Building on the decomposed MPO structure, we propose a novel fine-tuning strategy that updates only the parameters of the auxiliary tensors, and we design an optimization algorithm for MPO-based approximation over stacked network architectures. Our approach can be applied to either the original or the compressed PLMs in a general way, which yields a lighter network and significantly reduces the parameters to be fine-tuned. Extensive experiments demonstrate the effectiveness of the proposed approach in model compression, especially the reduction in fine-tuning parameters (91% reduction on average).

Comment: Accepted by ACL 2021 main conference
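As context for the abstract above, the following NumPy sketch illustrates one way an MPO-style (tensor-train) factorization of a weight matrix can be obtained via sequential SVDs, and how a central tensor can be separated from auxiliary tensors that would be the only parameters updated during fine-tuning. The function name mpo_decompose, the dimension factorizations, and the central/auxiliary split shown here are illustrative assumptions, not the authors' released implementation.

import numpy as np

def mpo_decompose(W, in_dims, out_dims, truncate=None):
    # Factorize a weight matrix W (I x J) into a chain of local 4-way tensors
    # (an MPO / tensor-train sweep). in_dims and out_dims are factorizations
    # of I and J, e.g. 768 = 4 * 4 * 4 * 12. Without truncation the
    # factorization is exact; `truncate` caps the bond dimension (rank).
    n = len(in_dims)
    assert len(out_dims) == n
    assert np.prod(in_dims) == W.shape[0] and np.prod(out_dims) == W.shape[1]

    # Reorder W so the k-th input/output factor pair sits on adjacent axes.
    T = W.reshape(list(in_dims) + list(out_dims))
    T = T.transpose([x for k in range(n) for x in (k, n + k)])

    cores, rank = [], 1
    for k in range(n - 1):
        # Peel off the k-th local tensor with an SVD.
        T = T.reshape(rank * in_dims[k] * out_dims[k], -1)
        U, S, Vh = np.linalg.svd(T, full_matrices=False)
        r = len(S) if truncate is None else min(len(S), truncate)
        cores.append(U[:, :r].reshape(rank, in_dims[k], out_dims[k], r))
        T = S[:r, None] * Vh[:r]   # carry the remainder to the next step
        rank = r
    cores.append(T.reshape(rank, in_dims[-1], out_dims[-1], 1))
    return cores

# Toy usage: a 16 x 16 matrix split into three local tensors.
W = np.random.randn(16, 16)
cores = mpo_decompose(W, in_dims=(2, 2, 4), out_dims=(4, 2, 2))

# Treat the middle core as the "central tensor" and the outer cores as the
# "auxiliary tensors" (the only ones that would be fine-tuned).
mid = len(cores) // 2
central, auxiliary = cores[mid], cores[:mid] + cores[mid + 1:]

# Contract the chain back and check that the untruncated factorization is exact.
full = cores[0]
for C in cores[1:]:
    full = np.tensordot(full, C, axes=([-1], [0]))
full = full.reshape([d for C in cores for d in C.shape[1:3]])
order = [2 * k for k in range(len(cores))] + [2 * k + 1 for k in range(len(cores))]
W_rec = full.transpose(order).reshape(W.shape)
print(np.allclose(W_rec, W))  # expected: True

In this sketch, freezing the central tensor and updating only the auxiliary tensors mirrors the paper's lightweight fine-tuning idea, since the auxiliary cores hold only a small fraction of the total parameters.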

Details

Database:
OpenAIRE
Journal:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Accession number:
edsair.doi.dedup.....a8f4606631bbcd7d58e31c7fad0f6efc