Gaussian process dynamical models for nonparametric speech representation and synthesis

Publication Year :
2012

Abstract

We propose Gaussian process dynamical models (GPDMs) as a new, nonparametric paradigm in acoustic models of speech. These use multidimensional, continuous state-spaces to overcome familiar issues with discrete-state, HMM-based speech models. The added dimensions allow the state to represent and describe more than just temporal structure as systematic differences in mean, rather than as mere correlations in a residual (which dynamic features or AR-HMMs do). Being based on Gaussian processes, the models avoid restrictive parametric or linearity assumptions on signal structure. We outline GPDM theory, and describe model setup and initialization schemes relevant to speech applications. Experiments demonstrate subjectively better quality of synthesized speech than from comparable HMMs. In addition, there is evidence for unsupervised discovery of salient speech structure.

QC 20120308
LISTA

Details

Database :
OAIster
Notes :
Henter, Gustav Eje; Frean, Marcus R.; Kleijn, W. Bastiaan
Publication Type :
Electronic Resource
Accession number :
edsoai.on1233559549
Document Type :
Electronic Resource
Full Text :
https://doi.org/10.1109/ICASSP.2012.6288919