1. Translation Quality and Error Recognition in Professional Neural Machine Translation Post-Editing
- Author
- Silvia Hansen-Schirra, Moritz Schaeffer, and Jennifer Vardaro
- Subjects
revision; Machine translation; Hjerson; Computer Networks and Communications; Computer science; post-editing effort; 02 engineering and technology; computer.software_genre; Keystroke logging; 01 natural sciences; Terminology; 010104 statistics & probability; Annotation; key-logging; 0202 electrical engineering, electronic engineering, information engineering; 0101 mathematics; error annotations; eye-tracking; lcsh:T58.5-58.64; lcsh:Information technology; business.industry; MQM; Communication; Fixation (psychology); European Commission (DGT); neural machine translation; Human-Computer Interaction; Workflow; post-editing; Eye tracking; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Sentence; Natural language processing
- Abstract
This study aims to analyse how translation experts from the German department of the European Commission's Directorate-General for Translation (DGT) identify and correct different error categories in neural machine translated texts (NMT) and their post-edited versions (NMTPE). The term translation expert encompasses translator, post-editor as well as revisor. Even though we focus on neural machine-translated segments, translator and post-editor are used synonymously because of the combined workflow using CAT-Tools as well as machine translation. Only the distinction between post-editor, which refers to a DGT translation expert correcting the neural machine translation output, and revisor, which refers to a DGT translation expert correcting the post-edited version of the neural machine translation output, is important and made clear whenever relevant. Using an automatic error annotation tool and the more fine-grained manual error annotation framework to identify characteristic error categories in the DGT texts, a corpus analysis revealed that quality assurance measures by post-editors and revisors of the DGT are most often necessary for lexical errors. More specifically, the corpus analysis showed that, if post-editors correct mistranslations, terminology or stylistic errors in an NMT sentence, revisors are likely to correct the same error type in the same post-edited sentence, suggesting that the DGT experts were being primed by the NMT output. Subsequently, we designed a controlled eye-tracking and key-logging experiment to compare participants' eye movements for test sentences containing the three identified error categories (mistranslations, terminology or stylistic errors) and for control sentences without errors. We examined the three error types' effect on early (first fixation durations, first pass durations) and late eye movement measures (e.g., total reading time and regression path durations).
Linear mixed-effects regression models predict what kind of behaviour of the DGT experts is associated with the correction of different error types during the post-editing process.
- Published
- 2019