1. Expressive Machine Dubbing Through Phrase-level Cross-lingual Prosody Transfer
- Author
- Swiatkowski, Jakub, Wang, Duo, Babianski, Mikolaj, Coccia, Giuseppe, Tobing, Patrick Lumban, Vipperla, Ravichander, Klimkov, Viacheslav, and Pollet, Vincent
- Subjects
- Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
- Speech generation for machine dubbing adds complexity to conventional Text-To-Speech solutions as the generated output is required to match the expressiveness, emotion and speaking rate of the source content. Capturing and transferring details and variations in prosody is a challenge. We introduce phrase-level cross-lingual prosody transfer for expressive multi-lingual machine dubbing. The proposed phrase-level prosody transfer delivers a significant 6.2% MUSHRA score increase over a baseline with utterance-level global prosody transfer, thereby closing the gap between the baseline and expressive human dubbing by 23.2%, while preserving intelligibility of the synthesised speech.
- Comment
- Accepted to INTERSPEECH 2023
- Published
- 2023