EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

Authors :: Gu, Yuxian
Wen, Jiaxin
Sun, Hao
Song, Yi
Ke, Pei
Zheng, Chujie
Zhang, Zheng
Yao, Jianzhu
Liu, Lei
Zhu, Xiaoyan
Huang, Minlie
Publication Year :: 2022
Abstract :: Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous works mainly focus on showing and evaluating the conversational performance of the released dialogue model, ignoring the discussion of some key factors towards a powerful human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture designs, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and codes publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases and pose some future research directions on large-scale Chinese open-domain dialogue systems.
Comment :: Machine Intelligence Research. https://link.springer.com/article/10.1007/s11633-022-1387-3 . 12 pages, 5 figures. The code and pre-trained models are publicly available at https://github.com/thu-coai/EVA

Subjects :: Computer Science - Computation and Language
Computer Science - Artificial Intelligence
