Preventing author profiling through zero-shot multilingual back-translation
- Source :
- 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov 2021, Punta Cana, Dominican Republic
- Publication Year :
- 2021
- Publisher :
- HAL CCSD, 2021.
Abstract
- Documents as short as a single sentence may inadvertently reveal sensitive information about their authors, including e.g. their gender or ethnicity. Style transfer is an effective way of transforming texts in order to remove any information that enables author profiling. However, for a number of current state-of-the-art approaches the improved privacy is accompanied by an undesirable drop in the down-stream utility of the transformed data. In this paper, we propose a simple, zero-shot way to effectively lower the risk of author profiling through multilingual back-translation using off-the-shelf translation models. We compare our models with five representative text style transfer models on three datasets across different domains. Results from both an automatic and a human evaluation show that our approach achieves the best overall performance while requiring no training data. We are able to lower the adversarial prediction of gender and race by up to 22% while retaining 95% of the original utility on downstream tasks.
- Comment: Accepted to EMNLP 2021 (Main Conference), 9 pages
- Subjects :
- FOS: Computer and information sciences
Computer Science - Computation and Language
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
02 engineering and technology
010501 environmental sciences
Computation and Language (cs.CL)
01 natural sciences
0105 earth and related environmental sciences
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov 2021, Punta Cana, Dominican Republic
- Accession number :
- edsair.doi.dedup.....2f4d21814195449fc6481f5242c83f6e