Start Over

WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

Authors :: Wang, Peng
Li, Zexi
Zhang, Ningyu
Xu, Ziwen
Yao, Yunzhi
Jiang, Yong
Xie, Pengjun
Huang, Fei
Chen, Huajun
Publication Year :: 2024
Abstract: Large language models (LLMs) need knowledge updates to meet the ever-growing world facts and correct the hallucinated responses, facilitating the methods of lifelong model editing. Where the updated knowledge resides in memories is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowledge of neural network activations/representations by retrieval) will result in an impossible triangle -- reliability, generalization, and locality can not be realized together in the lifelong editing settings. For long-term memory, directly editing the parameters will cause conflicts with irrelevant pretrained knowledge or previous edits (poor reliability and locality). For working memory, retrieval-based activations can hardly make the model understand the edits and generalize (poor generalization). Therefore, we propose WISE to bridge the gap between memories. In WISE, we design a dual parametric memory scheme, which consists of the main memory for the pretrained knowledge and a side memory for the edited knowledge. We only edit the knowledge in the side memory and train a router to decide which memory to go through when given a query. For continual editing, we devise a knowledge-sharding mechanism where different sets of edits reside in distinct subspaces of parameters, and are subsequently merged into a shared memory without conflicts. Extensive experiments show that WISE can outperform previous model editing methods and overcome the impossible triangle under lifelong model editing of question answering, hallucination, and out-of-distribution settings across trending LLM architectures, e.g., GPT, LLaMA, and Mistral. Code will be released at https://github.com/zjunlp/EasyEdit.<br />Comment: Work in progress

Subjects :: Computer Science - Computation and Language
Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Information Retrieval
Computer Science - Machine Learning

Details

Database :: arXiv
Publication Type :: Report
Accession number :: edsarx.2405.14768
Document Type :: Working Paper

Tools

Email
Cite

Printer

Authors Abstract Subjects Details

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

Abstract

Subjects

Details

Tools

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

Abstract

Subjects

Details

Tools

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources