
Revisiting tri-training of dependency parsers

Authors :
Joachim Wagner
Jennifer Foster
Source :
Wagner, Joachim ORCID: 0000-0002-8290-3849 and Foster, Jennifer ORCID: 0000-0002-7789-4853 (2021) Revisiting tri-training of dependency parsers. In: 2021 Conference on Empirical Methods in Natural Language Processing, 7-11 Nov 2021, Online and Punta Cana, Dominican Republic.
Publication Year :
2021
Publisher :
Association for Computational Linguistics (ACL), 2021.

Abstract

We compare two orthogonal semi-supervised learning techniques, namely tri-training and pretrained word embeddings, in the task of dependency parsing. We explore language-specific FastText and ELMo embeddings and multilingual BERT embeddings. We focus on a low-resource scenario as semi-supervised learning can be expected to have the most impact here. Based on treebank size and available ELMo models, we select Hungarian, Uyghur (a zero-shot language for mBERT) and Vietnamese. Furthermore, we include English in a simulated low-resource setting. We find that pretrained word embeddings make more effective use of unlabelled data than tri-training but that the two approaches can be successfully combined.

Comment: 17 pages, 1 figure, to be published at EMNLP 2021
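For readers unfamiliar with tri-training, the sketch below illustrates the classic tri-training loop (Zhou and Li, 2005) on a generic classification task: three learners are trained on bootstrap samples of the labelled data, and in each round an unlabelled example is added to one learner's training set when the other two learners agree on its label. This is an illustrative simplification only, assuming a scikit-learn-style classifier; it omits the error-rate conditions of the original algorithm and the dependency-parser setup used in the paper, and all function and variable names are hypothetical.

```python
# Minimal sketch of the tri-training loop (Zhou & Li, 2005) on a generic
# classification task -- illustrative only, not the parser setup of the paper.
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier


def tri_train(labelled_X, labelled_y, unlabelled_X, rounds=5, seed=0):
    rng = np.random.default_rng(seed)

    # Train three learners on bootstrap samples of the labelled data.
    learners = []
    for _ in range(3):
        idx = rng.integers(0, len(labelled_X), size=len(labelled_X))
        m = DecisionTreeClassifier(random_state=int(rng.integers(1 << 30)))
        m.fit(labelled_X[idx], labelled_y[idx])
        learners.append(m)

    for _ in range(rounds):
        new_learners = []
        for i, learner in enumerate(learners):
            j, k = [n for n in range(3) if n != i]
            pred_j = learners[j].predict(unlabelled_X)
            pred_k = learners[k].predict(unlabelled_X)
            # Unlabelled examples on which the other two learners agree are
            # added, with the agreed label, to learner i's training data.
            agree = pred_j == pred_k
            X_aug = np.concatenate([labelled_X, unlabelled_X[agree]])
            y_aug = np.concatenate([labelled_y, pred_j[agree]])
            m = clone(learner)
            m.fit(X_aug, y_aug)
            new_learners.append(m)
        learners = new_learners

    return learners
```

In the paper's setting the three learners would be dependency parsers rather than simple classifiers, and "agreement" is computed over predicted parse trees, but the round-based agree-and-retrain structure is the same.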

Details

Language :
English
Database :
OpenAIRE
Journal :
Wagner, Joachim ORCID: 0000-0002-8290-3849 and Foster, Jennifer ORCID: 0000-0002-7789-4853 (2021) Revisiting tri-training of dependency parsers. In: 2021 Conference on Empirical Methods in Natural Language Processing, 7-11 Nov 2021, Online and Punta Cana, Dominican Republic.
Accession number :
edsair.doi.dedup.....09845675b56c5e8295803e602b56ad66