A real-time French text-to-speech system generating high-quality synthetic speech
- Source :
- ICASSP
- Publication Year :
- 2002
- Publisher :
- IEEE, 2002.
Abstract
- The main features of the CNET diphone-based text-to-speech system for French language are described. The linguistic analysis works in three steps. First, a morphosyntactic analysis module assigns a grammatical value to each word in the text and transcribes it phonetically. A second module parses the text into hierarchical syntactico-prosodic groups. Finally, prosodic patterns are automatically assigned to each word by queries to a database of prosodic events. The phonetic and prosodic information serves as commands to the synthesis component. The synthesis component is based on diphone concatenation. A time-domain formulation of the pitch-synchronous overlap-add scheme (TD-PSOLA) is used to modify the speech prosody and to concatenate diphone waveforms. It is combined with a low bit-rate speech decoder to reduce the memory requirement for storing the diphone inventory. The system runs in real time on a PC equipped with a TMS320C25 DSP board and provides notably improved sound quality and naturalness in comparison to commercially available systems.
- Subjects :
- business.industry
Computer science
Speech recognition
Concatenation
Speech corpus
Speech synthesis
Diphone
computer.software_genre
Speech processing
ComputingMethodologies_PATTERNRECOGNITION
MBROLA
Artificial intelligence
Sound quality
business
Prosody
computer
Natural language processing
Word (computer architecture)
Details
- Database :
- OpenAIRE
- Journal :
- International Conference on Acoustics, Speech, and Signal Processing
- Accession number :
- edsair.doi...........8abdfb63df675013846ce34d8f039b62
- Full Text :
- https://doi.org/10.1109/icassp.1990.115650