Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks
- Source :
- Emelin, D, Titov, I & Sennrich, R 2020, Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 7635–7653, The 2020 Conference on Empirical Methods in Natural Language Processing, Virtual conference, 16/11/20. https://doi.org/10.18653/v1/2020.emnlp-main.616, EMNLP (1)
- Publication Year :
- 2020
Abstract
- Word sense disambiguation is a well-known source of translation errors in NMT. We posit that some of the incorrect disambiguation choices are due to models' over-reliance on dataset artifacts found in training data, specifically superficial word co-occurrences, rather than a deeper understanding of the source text. We introduce a method for the prediction of disambiguation errors based on statistical data properties, demonstrating its effectiveness across several domains and model types. Moreover, we develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors to further probe the robustness of translation models. Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
- Comment: Accepted to EMNLP 2020
- Subjects :
- FOS: Computer and information sciences
Training set
Word-sense disambiguation
Computer Science - Computation and Language
Machine translation
Computer science
410 Linguistics
02 engineering and technology
000 Computer science, knowledge & systems
16. Peace & justice
Adversarial system
020204 information systems
10105 Institute of Computational Linguistics
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Source text
Artificial intelligence
Computation and Language (cs.CL)
Natural language processing
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 7635–7653
- Accession number :
- edsair.doi.dedup.....ff53e7c96dd47168c0c69de6002e7c4c