Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation

Authors :: Beomsu Kim
Seokjun Seo
Seungju Han
Enkhbayar Erdenee
Buru Chang
Publication Year :: 2021
Publisher :: arXiv, 2021.
Abstract: Despite the remarkable performance of large-scale generative models in open-domain conversation, they are known to be less practical for building real-time conversation systems due to high latency. On the other hand, retrieval models could return responses with much lower latency but show inferior performance to the large-scale generative models since the conversation quality is bounded by the pre-defined response set. To take advantage of both approaches, we propose a new training method called G2R (Generative-to-Retrieval distillation) that preserves the efficiency of a retrieval model while leveraging the conversational ability of a large-scale generative model by infusing the knowledge of the generative model into the retrieval model. G2R consists of two distinct techniques of distillation: the data-level G2R augments the dialogue dataset with additional responses generated by the large-scale generative model, and the model-level G2R transfers the response quality score assessed by the generative model to the score of the retrieval model by the knowledge distillation loss. Through extensive experiments including human evaluation, we demonstrate that our retrieval-based conversation system trained with G2R shows a substantially improved performance compared to the baseline retrieval model while showing significantly lower inference latency than the large-scale generative models.
Comment: EMNLP21-Findings
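
Illustration (not part of the record): the abstract says model-level G2R transfers the generative model's response quality score to the retrieval model's score through a knowledge distillation loss, but does not give the loss itself. The sketch below shows one common form of such a loss, a cross-entropy between softmax-normalized teacher and student scores over a set of candidate responses. The function names, the temperature parameter, and the example scores are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """Cross-entropy between teacher and student candidate distributions.

    Assumption: teacher_scores stand in for quality scores a generative
    model assigns to candidate responses, and student_scores stand in for
    the retrieval model's scores for the same candidates.
    """
    teacher_probs = softmax(teacher_scores, temperature)
    student_probs = softmax(student_scores, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


# Hypothetical scores for three candidate responses to one dialogue context.
teacher = [2.0, 0.5, -1.0]
agreeing_student = [2.0, 0.5, -1.0]
disagreeing_student = [-1.0, 0.5, 2.0]

print(distillation_loss(teacher, agreeing_student))
print(distillation_loss(teacher, disagreeing_student))
```

Under these assumptions the loss is smallest when the student ranks candidates the way the teacher does, so minimizing it pushes the retrieval model's scores toward the generative model's judgments.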

Subjects :: FOS: Computer and information sciences
Computer Science - Computation and Language
Artificial Intelligence (cs.AI)
Computer Science - Artificial Intelligence
Computation and Language (cs.CL)

Details

Database :: OpenAIRE
Accession number :: edsair.doi.dedup.....9188e3a2701de33732648b806c6b55d9
Full Text :: https://doi.org/10.48550/arxiv.2108.12582
