POS-Constrained Parallel Decoding for Non-autoregressive Generation
- Source: ACL/IJCNLP (1)
- Publication Year: 2021
- Publisher: Association for Computational Linguistics
Abstract
- The multimodality problem has become a major challenge for existing non-autoregressive generation (NAG) systems. A common solution resorts to sequence-level knowledge distillation, rebuilding the training dataset through autoregressive generation (hereinafter "teacher AG"). The success of such methods largely depends on a latent assumption: that the teacher AG is superior to the NAG model. In this work, however, we experimentally show that this assumption does not always hold for text generation tasks such as text summarization and story ending generation. To provide a feasible solution to the multimodality problem of NAG, we propose incorporating linguistic structure (the Part-of-Speech sequence in particular) into NAG inference instead of relying on teacher AG. Specifically, the proposed POS-constrained Parallel Decoding (POSPD) method provides a POS sequence to constrain the NAG model during decoding. Our experiments demonstrate that POSPD consistently improves NAG models on four text generation tasks, and to a greater extent than knowledge distillation. This observation validates the necessity of exploring alternatives to sequence-level knowledge distillation.
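To make the decoding constraint concrete, below is a minimal sketch of what POS-constrained parallel decoding could look like if the POS sequence were applied as a hard vocabulary mask at each output position. The abstract does not specify how POSPD obtains or enforces the POS sequence, so everything here (the function name `pos_constrained_decode`, the precomputed `vocab_pos_mask` table, and the hard-masking strategy) is an illustrative assumption rather than the paper's actual implementation.

```python
import torch

def pos_constrained_decode(logits, pos_constraints, vocab_pos_mask):
    """Illustrative POS-constrained parallel decoding (hypothetical API).

    logits:          (seq_len, vocab_size) scores from one parallel forward
                     pass of a NAG model.
    pos_constraints: sequence of seq_len POS-tag ids, one per output position.
    vocab_pos_mask:  (num_pos_tags, vocab_size) boolean tensor; entry [p, v]
                     is True if vocabulary token v is compatible with POS tag p.
    """
    masked = logits.clone()
    for i, pos in enumerate(pos_constraints):
        # Forbid every token whose POS tag conflicts with the constraint
        # at position i by pushing its score to -inf.
        masked[i, ~vocab_pos_mask[pos]] = float("-inf")
    # All positions are decoded in parallel: one argmax per row.
    return masked.argmax(dim=-1)

# Toy usage: 3 positions, a 5-token vocabulary, and 2 POS tags.
logits = torch.randn(3, 5)
vocab_pos_mask = torch.tensor([[True, True, False, False, True],
                               [False, False, True, True, False]])
print(pos_constrained_decode(logits, [0, 1, 0], vocab_pos_mask))
```

Under these assumptions, the POS sequence prunes each position's candidate set before the argmax, which is one plausible way a linguistic constraint could reduce the multimodality of parallel decoding.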
Details
- Database: OpenAIRE
- Journal: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Accession number: edsair.doi...........45556a5107d7d7e8dd16b31e3560ae59
- Full Text: https://doi.org/10.18653/v1/2021.acl-long.467