1. Balanced Arabic corpus design for speech synthesis.
- Author
- Amrouche, Aissa; Abed, Ahcène; Ferrat, Kamel; Boubakeur, Khadidja Nesrine; Bentrcia, Youssouf; Falek, Leila
- Subjects
- HIDDEN Markov models, CORPORA, PHONEME (Linguistics), SPEECH synthesis, ARABIC language
- Abstract
- This paper aims to design and validate a phonetically balanced speech corpus for the Arabic language. Designing and developing a rich and phonetically balanced corpus in optimal context is one of the key issues in building high-quality text-to-speech synthesis systems. The corpus is rich in the sense that it must contain all possible phonemes in their left and right contexts, and balanced in the sense that it respects the phonetic distribution of the language. We propose a new methodology for designing and implementing such a corpus for speech synthesis purposes. The paper explains the whole creation process of this corpus: the design stage, corpus creation, the recording phases, and finally the segmentation of the speech corpus. The speech corpus contains 202 sentences with 6174 phonemes. To validate the speech corpus, an Arabic speech synthesis system using Hidden Markov Models has been developed. [ABSTRACT FROM AUTHOR]
- Published
- 2021