The TRANSCOMP Dataset of Literary Translations from 120 Languages and a Parallel Collection of English-language Originals

Authors :
Matt Erlin
Andrew Piper
Douglas Knox
Stephen Pentecost
Allie Blank
Source :
Journal of Open Humanities Data, Vol 8 (2022)
Publication Year :
2022
Publisher :
Ubiquity Press, 2022.

Abstract

The TRANSCOMP Dataset of Literary Translations is a collection of document-level word frequencies sampled from 10,631 translations into English of global literary fiction published since 1950, together with a historically matched parallel corpus of 10,682 fictional works originally published in English. We provide CSV files with word frequency counts for 10,000-word samples taken from each text. The associated metadata is available in a separate CSV. These data will be useful to literary scholars and linguists working in translation studies, and those interested in the linguistic, stylistic, and thematic specificity of translations from particular regions.
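
As a rough illustration of what document-level word frequency counts for a 10,000-word sample could look like, the following Python sketch counts words in a contiguous sample of one text and serializes the result as CSV. It is not the authors' pipeline: the tokenization rule, the contiguous sampling from a given start offset, the function names, and the `word,count` column layout are all assumptions made for this example, since the abstract does not specify them.

```python
import csv
import io
import re
from collections import Counter


def sample_word_frequencies(text, sample_size=10_000, start=0):
    """Count word frequencies in a contiguous sample of up to sample_size tokens.

    Assumption: tokens are lowercase runs of letters and apostrophes, and the
    sample is a contiguous slice beginning at `start`. The dataset's actual
    tokenization and sampling procedure may differ.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    sample = tokens[start:start + sample_size]
    return Counter(sample)


def frequencies_to_csv(counts):
    """Serialize counts as CSV text with a hypothetical word,count header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word", "count"])
    # Sort by descending count, then alphabetically, so the output is stable.
    for word, count in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        writer.writerow([word, count])
    return buffer.getvalue()
```

Because every sample has the same nominal length, raw counts from two texts can be compared directly without normalizing by document length, which is one plausible reason for fixed-size sampling.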

Details

Language :
English
ISSN :
2059-481X
Volume :
8
Database :
Directory of Open Access Journals
Journal :
Journal of Open Humanities Data
Publication Type :
Academic Journal
Accession number :
edsdoj.370f45f60a242a2bfa3e761cfb3d31b
Document Type :
article
Full Text :
https://doi.org/10.5334/johd.94