Whisper to normal speech conversion using pitch estimated from spectrum.

Authors :
Konno, Hideaki
Kudo, Mineichi
Imai, Hideyuki
Sugimoto, Masanori
Source :
Speech Communication. Oct 2016, Vol. 83, pp. 10-20. 11 pp.
Publication Year :
2016

Abstract

We can perceive pitch in whispered speech, although fundamental frequency (F0) does not exist physically or phonetically due to the lack of vocal-fold vibration. This study was carried out to determine how people generate such an unvoiced pitch. We conducted experiments in which speakers uttered five whispered Japanese vowels in accordance with the pitch of a guide pure tone. From the results, we derived a multiple regression function to convert the outputs of a mel-scaled filter bank of whispered speech into the perceived pitch value. Next, using this estimated pitch value as F0, we constructed a system for conversion of whispered speech to normal speech. Since the pitch varies with time according to the spectral shape, it was expected that the pitch accent would be kept by this conversion. Indeed, auditory experiments demonstrated that the correctly perceived rate of Japanese word accent was increased from 55.5% to 72.0% compared with that when a constant F0 was used. [ABSTRACT FROM AUTHOR]
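The regression step described in the abstract can be illustrated with a minimal sketch. It assumes a plain linear model with an intercept, fitted by ordinary least squares, and it assumes the mel-scaled filter bank outputs have already been computed for each frame. The channel count (4), the weights, and the synthetic training data are all hypothetical; none of them come from the paper, and the authors' actual regression form and coefficients may differ.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_pitch_regression(features, pitches):
    """Least-squares fit of pitch = w[0] + sum_k w[k + 1] * features[k].

    `features` holds one vector of filter bank outputs per frame
    (assumed precomputed); `pitches` holds the target pitch per frame.
    """
    rows = [[1.0] + list(f) for f in features]  # prepend intercept term
    d = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, pitches)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict_pitch(weights, feature):
    """Apply the fitted regression to one frame of filter bank outputs."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Synthetic, noise-free demo data (hypothetical, for illustration only).
rng = random.Random(0)
true_weights = [120.0, 4.0, -2.5, 1.5, 3.0]  # intercept + 4 channels
train_x = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(50)]
train_y = [predict_pitch(true_weights, x) for x in train_x]

weights = fit_pitch_regression(train_x, train_y)

# A per-frame pitch track that a synthesizer could then use as F0.
f0_track = [predict_pitch(weights, x) for x in train_x[:5]]
```

Because the estimate is recomputed for every frame, the resulting F0 track follows changes in spectral shape over time, which is the property the abstract credits for preserving pitch accent better than a constant F0.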

Details

Language :
English
ISSN :
0167-6393
Volume :
83
Database :
Academic Search Index
Journal :
Speech Communication
Publication Type :
Academic Journal
Accession number :
117734247
Full Text :
https://doi.org/10.1016/j.specom.2016.07.001