Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice conversion
- Source :
- EURASIP Journal on Audio, Speech, and Music Processing, Vol 2017, Iss 1, Pp 1-10 (2017)
- Publication Year :
- 2017
- Publisher :
- SpringerOpen, 2017.
Abstract
- In this paper, we present a voice conversion (VC) method that does not use any parallel data when training the model. Voice conversion is a technique in which only the speaker-specific information in the source speech is converted while the phonological information is kept unchanged. Most existing VC methods rely on parallel data, that is, pairs of speech data from the source and target speakers uttering the same sentences. However, the use of parallel data in training causes several problems: (1) the training data is limited to pre-defined sentences, (2) the trained model can only be applied to the speaker pair used in training, and (3) a mismatch in alignment may occur. Although it is generally preferable in VC not to use parallel data, a non-parallel model is considered difficult to train. In our approach, we realize non-parallel training based on speaker-adaptive training (SAT). Speech signals are represented using a probabilistic model based on the Boltzmann machine that defines phonological information and speaker-related information explicitly. Speaker-independent (SI) and speaker-dependent (SD) parameters are trained simultaneously using SAT. In the conversion stage, a given speech signal is decomposed into phonological and speaker-related information, the speaker-related information is replaced with that of the desired speaker, and voice-converted speech is then obtained by combining the two. Our experimental results showed that our approach outperformed the conventional non-parallel approach on both objective and subjective criteria.
Details
- Language :
- English
- ISSN :
- 1687-4722
- Volume :
- 2017
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- EURASIP Journal on Audio, Speech, and Music Processing
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.0d5fd99fdb6c473faa1308fe95508165
- Document Type :
- article
- Full Text :
- https://doi.org/10.1186/s13636-017-0112-6