Cross-linguistic authorship attribution and gender profiling. Machine translation as a method for bridging the language gap.
- Source :
- Digital Scholarship in the Humanities. Sep 2024, Vol. 39 Issue 3, p954-967. 14p.
- Publication Year :
- 2024
Abstract
- This study explores the feasibility of cross-linguistic authorship attribution and author gender identification using Machine Translation (MT). Computational stylistics experiments were conducted on a Greek blog corpus translated into English using Google's Neural MT. A Random Forest algorithm was employed for authorship and gender profiling, using different feature groups [Author's Multilevel N-gram Profiles, quantitative linguistics (QL), and cross-lingual word embeddings (CLWE)] in both original and translated texts. Results indicate that MT is a viable method for converting a multilingual corpus into one language for authorship attribution and gender profiling research, with considerable accuracy when the training and testing datasets are in the same language. In the pure cross-linguistic scenario, higher accuracies than the baselines were obtained using CLWE and QL features. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 2055-768X
- Volume :
- 39
- Issue :
- 3
- Database :
- Academic Search Index
- Journal :
- Digital Scholarship in the Humanities
- Publication Type :
- Academic Journal
- Accession number :
- 179512336
- Full Text :
- https://doi.org/10.1093/llc/fqae028