Glottal Flow Synthesis for Whisper-to-Speech Conversion
- Source :
- IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2020, 28, pp. 889-900. ⟨10.1109/TASLP.2020.2971417⟩
- Publication Year :
- 2020
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2020.
Abstract
- Whisper-to-speech conversion is motivated by laryngeal disorders, in which malfunction of the vocal folds leads to loss of voicing. Many patients with laryngeal disorders can still produce functional whispers, since these are characterised by the absence of vocal fold vibration. Whispers therefore constitute a common ground for speech rehabilitation across many kinds of laryngeal disorder. Whisper-to-speech conversion involves recreating natural-sounding speech from recorded whispers, and is a non-invasive and non-surgical rehabilitation that can maintain a natural method of speaking, unlike the existing methods of rehabilitation. This paper proposes a new rule-based method for whisper-to-speech conversion that replaces the noisy whisper sound source with a synthesised speech-like harmonic source, while maintaining the vocal tract component unaltered. In particular, a novel glottal source generator is developed in which whisper information is used to parameterise the excitation through a high-quality glottis model. Evaluation of the system against the standard pulse train excitation method reveals significantly improved performance. Since our method is glottis-based, it is potentially compatible with the many existing vocal tract component adaptation systems.
- Subjects :
- Glottis
Acoustics and Ultrasonics
Computer science
Speech recognition
Laryngectomy
Speech synthesis
computer.software_genre
01 natural sciences
Glottal Flow Model
030507 speech-language pathology & audiology
03 medical and health sciences
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
0103 physical sciences
Computer Science (miscellaneous)
medicine
[INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC]
Electrical and Electronic Engineering
010301 acoustics
Vocal source excitation
Glottal flow
Computational Mathematics
Improved performance
Laryngeal Disorder
medicine.anatomical_structure
Vocal folds
Whisper-to-speech conversion
Voice
0305 other medical science
computer
Vocal tract
Details
- ISSN :
- 2329-9304 and 2329-9290
- Volume :
- 28
- Database :
- OpenAIRE
- Journal :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- Accession number :
- edsair.doi.dedup.....09ab4585245ede025fb59d1e5bca234e
- Full Text :
- https://doi.org/10.1109/taslp.2020.2971417
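
The abstract describes replacing the noisy whisper source with a synthesised harmonic source while leaving the vocal tract component unaltered. The following is only a rough sketch of that source-filter idea, not the paper's method. A Rosenberg-style pulse stands in for the glottis model, which this record does not specify. The single 700 Hz resonance stands in for a vocal tract filter that would in practice be estimated from the recorded whisper. All function names and parameter values (`rosenberg_cycle`, `glottal_source`, `resonator`, `all_pole_filter`, the open and speed quotients) are assumptions made for illustration.

```python
import math


def rosenberg_cycle(period, open_quotient=0.6, speed_quotient=2.0):
    """One cycle of a Rosenberg-style glottal flow pulse, `period` samples long.

    Stand-in for a glottis model; open_quotient and speed_quotient are
    illustrative values, not taken from the paper.
    """
    n_open = int(round(open_quotient * period))  # samples with the glottis open
    n_rise = int(round(n_open * speed_quotient / (1.0 + speed_quotient)))
    n_fall = n_open - n_rise
    cycle = []
    for n in range(period):
        if n < n_rise:        # opening phase: flow rises smoothly from zero
            cycle.append(0.5 * (1.0 - math.cos(math.pi * n / n_rise)))
        elif n < n_open:      # closing phase: flow falls back towards zero
            cycle.append(math.cos(math.pi * (n - n_rise) / (2.0 * n_fall)))
        else:                 # closed phase: no flow
            cycle.append(0.0)
    return cycle


def glottal_source(f0, fs, n_samples):
    """Periodic (harmonic) excitation built by repeating one glottal cycle."""
    period = int(round(fs / f0))
    cycle = rosenberg_cycle(period)
    return [cycle[n % period] for n in range(n_samples)]


def resonator(freq, fs, radius=0.95):
    """Denominator coefficients [a1, a2] of a two-pole resonance (hypothetical tract)."""
    return [-2.0 * radius * math.cos(2.0 * math.pi * freq / fs), radius * radius]


def all_pole_filter(x, a):
    """Apply y[n] = x[n] - sum_k a[k-1] * y[n-k], i.e. the filter 1/A(z)."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y


FS = 16000                                   # assumed sampling rate in Hz
source = glottal_source(f0=125.0, fs=FS, n_samples=FS // 10)
tract = resonator(700.0, FS)                 # would be estimated from the whisper
voiced = all_pole_filter(source, tract)      # harmonic source through the tract filter
```

In this sketch the filter coefficients are untouched by the source swap, which mirrors the abstract's point that a glottis-based method can be combined with separate vocal tract adaptation systems.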