Footprint reduction of Concatenative Text-To-Speech synthesizers using polynomial temporal decomposition

Authors :
David Malah
Tamar Shoham
Slava Shechtman
Source :
2010 4th International Symposium on Communications, Control and Signal Processing (ISCCSP).
Publication Year :
2010
Publisher :
IEEE, 2010.

Abstract

High-quality, low-footprint Concatenative Text-To-Speech (CTTS) synthesizers pose a persistent challenge in the field of speech processing. The spectral parameters representing the short speech segments used in the concatenation process account for a large portion of the required memory. In this paper, we propose using a vectorial form of Polynomial Temporal Decomposition, combined with jointly optimal segmentation and polynomial order selection, to reduce the storage required for the spectral amplitude parameters by 50% while preserving the perceptual quality of the synthesized speech.

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........58f244f63884239acfee907b2e4141f9
Full Text :
https://doi.org/10.1109/isccsp.2010.5463316