AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation

Authors :: Xu, Xinnuo
Wang, Guoyin
Kim, Young-Bum
Lee, Sungjin
Source :: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL2021)
Publication Year :: 2021
Abstract: Natural Language Generation (NLG) is a key component of a task-oriented dialogue system; it converts a structured meaning representation (MR) into natural language. For large-scale conversational systems, which commonly have hundreds of intents and thousands of slots, neither template-based nor model-based approaches scale. Recently, neural NLG models have begun leveraging transfer learning and have shown promising results in few-shot settings. This paper proposes AUGNLG, a novel data augmentation approach that combines a self-trained neural retrieval model with a few-shot learned NLU model to automatically create MR-to-Text data from open-domain texts. The proposed system mostly outperforms the state-of-the-art methods on the FewShotWOZ data in both BLEU and Slot Error Rate. We further confirm improved results on the FewShotSGD data and provide a comprehensive analysis of the key components of our system. Our code and data are available at https://github.com/XinnuoXu/AugNLG.

Database :: arXiv
Journal :: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL2021)
Publication Type :: Report
Accession number :: edsarx.2106.05589
Document Type :: Working Paper
