
POS-Constrained Parallel Decoding for Non-autoregressive Generation

Authors:
Kexin Yang
Jiancheng Lv
Dayiheng Liu
Weizhen Qi
Wenqiang Lei
Source:
ACL/IJCNLP (1)
Publication Year:
2021
Publisher:
Association for Computational Linguistics, 2021.

Abstract

The multimodality problem remains a major challenge for existing non-autoregressive generation (NAG) systems. A common remedy is sequence-level knowledge distillation, which rebuilds the training dataset through autoregressive generation (hereinafter “teacher AG”). The success of such methods largely depends on a latent assumption: that the teacher AG is superior to the NAG model. However, in this work, we experimentally reveal that this assumption does not always hold for text generation tasks such as text summarization and story ending generation. To provide a feasible solution to the multimodality problem of NAG, we propose incorporating linguistic structure (specifically, Part-of-Speech sequences) into NAG inference instead of relying on teacher AG. The proposed POS-constrained Parallel Decoding (POSPD) method supplies a specific POS sequence that constrains the NAG model during decoding. Our experiments demonstrate that POSPD consistently improves NAG models on four text generation tasks, and to a greater extent than knowledge distillation. This observation validates the necessity of exploring alternatives to sequence-level knowledge distillation.
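
The abstract does not spell out the constraint mechanism, but the core idea lends itself to a short illustration. Below is a minimal sketch, assuming a toy vocabulary with a fixed token-to-POS lookup and random stand-in logits; the names (`pos_mask`, `decode_pospd`) and the logit-masking mechanism are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of POS-constrained parallel decoding under the assumptions above:
# each output position is decoded independently (in parallel), but only
# among vocabulary tokens whose POS tag matches that position's constraint.
import numpy as np

# Toy vocabulary and token -> POS lookup (assumed for illustration).
VOCAB = ["the", "a", "dog", "cat", "runs", "sleeps", "quickly", "happily"]
TOKEN_POS = {"the": "DET", "a": "DET",
             "dog": "NOUN", "cat": "NOUN",
             "runs": "VERB", "sleeps": "VERB",
             "quickly": "ADV", "happily": "ADV"}

def pos_mask(pos_seq):
    """Build a (seq_len, vocab_size) 0/1 mask that allows, at each
    position, only the tokens whose POS tag matches the constraint."""
    mask = np.zeros((len(pos_seq), len(VOCAB)))
    for i, tag in enumerate(pos_seq):
        for j, tok in enumerate(VOCAB):
            if TOKEN_POS[tok] == tag:
                mask[i, j] = 1.0
    return mask

def decode_pospd(logits, pos_seq):
    """Decode all positions in one parallel step: mask out incompatible
    tokens, then take a per-position argmax over the remaining ones."""
    masked = np.where(pos_mask(pos_seq) > 0, logits, -np.inf)
    ids = masked.argmax(axis=-1)
    return [VOCAB[i] for i in ids]

# Stand-in for NAG model output: one row of logits per target position.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, len(VOCAB)))
print(decode_pospd(logits, ["DET", "NOUN", "VERB", "ADV"]))
```

Setting disallowed logits to negative infinity before the per-position argmax keeps decoding fully parallel: positions are still decided independently, so the NAG speed advantage is preserved while the POS sequence rules out incompatible token choices.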

Details

Database:
OpenAIRE
Journal:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Accession number:
edsair.doi...........45556a5107d7d7e8dd16b31e3560ae59
Full Text:
https://doi.org/10.18653/v1/2021.acl-long.467