
Search and Learn: Improving Semantic Coverage for Data-to-Text Generation

Authors :
Jolly, Shailza
Zhang, Zi Xuan
Dengel, Andreas
Mou, Lili
Publication Year :
2021

Abstract

Data-to-text generation systems aim to generate text descriptions based on input data (often represented in tabular form). A typical system requires a huge number of training samples to learn the correspondence between tables and texts. However, large training sets are expensive to obtain, limiting the applicability of these approaches in real-world scenarios. In this work, we focus on few-shot data-to-text generation. We observe that, while fine-tuned pretrained language models may generate plausible sentences, they suffer from low semantic coverage in the few-shot setting: important input slots tend to be missing from the generated text. To address this, we propose a search-and-learning approach that leverages pretrained language models but inserts the missing slots to improve semantic coverage. We further fine-tune our system on the search results to smooth out search noise, yielding better-quality text and greatly improving inference efficiency. Experiments show that our model achieves high performance on the E2E and WikiBio datasets. In particular, we cover 98.35% of input slots on E2E, largely alleviating the low-coverage problem.

Comment: Accepted by AAAI'22
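To make the "low semantic coverage" problem concrete, the following is a minimal sketch (not the paper's method) of how one might check which input slots are missing from a generated sentence. The slot names and example values are invented for illustration, loosely in the style of the E2E dataset.

```python
# Hypothetical illustration: given input slots (table fields) and a generated
# sentence, report which slot values never appear in the text, and compute a
# simple string-match coverage score. All names and data here are invented.

def missing_slots(slots: dict, text: str) -> list:
    """Return the input slot values that do not appear in the generated text."""
    lower = text.lower()
    return [value for value in slots.values() if value.lower() not in lower]

def slot_coverage(slots: dict, text: str) -> float:
    """Fraction of input slots whose values are realized in the text."""
    return 1 - len(missing_slots(slots, text)) / len(slots)

# Example: a few-shot model drops the "area" slot from its output.
slots = {"name": "Aromi", "eatType": "coffee shop", "area": "city centre"}
text = "Aromi is a coffee shop."
print(missing_slots(slots, text))   # the "city centre" value is not covered
print(slot_coverage(slots, text))
```

Exact string matching is of course a crude proxy; the point is only to show what "covering an input slot" means in this context.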

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2112.02770
Document Type :
Working Paper