
Vocal Tract Length Estimation Using Accumulated Means of Formants and Its Effects on Speaker-Normalization

Authors :
Tadashi Sakata
Naomitsu Ikeda
Akira Watanabe
Yuichi Ueda
Source :
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:1049-1064
Publication Year :
2021
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2021.

Abstract

Differences in vocal tract length (VTL) among individual speakers cause variations in the acoustic features of phonemes. In this paper, we propose a simple method to estimate speaker-specific VTLs and to quantitatively evaluate the speaker-normalization effects of those VTLs. We employed accumulated means of formant trajectories to estimate the VTLs of speakers ranging from children to adults. For formant estimation, the inverse-filter control (IFC) system was used. In this system, the choice of analysis order, that is, the number of formants to be estimated, is automated. Moreover, to evaluate the speaker-normalization effect of VTLs, we proposed a data reduction method that finds dense elliptical areas within distributions in the formant space. Using these ellipse areas, we evaluated three normalization effects of VTLs: normalization by the mean of all VTLs (the standard), by speaker-categorical means of VTLs, and by individual VTLs. Relative to the standard area of the original data, the area was reduced by 39.5% with the categorical means and by 46.6% with the individual VTLs. As a result, the proposed method was used to provide a “normalized vowel map (NVM)” that visualizes universal vowel distributions as a core image of linguistic information. Finally, using the proposed methods, we compared the estimated VTLs with those obtained by another method based on magnetic resonance imaging (MRI) data.
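The idea of estimating VTL from accumulated formant means and then normalizing by it can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not give the estimation formula, so the sketch assumes the textbook uniform-tube (quarter-wavelength) model, F_n = (2n - 1)c / (4L), and an assumed speed of sound of 35000 cm/s. The function names `estimate_vtl_cm` and `normalize_formants` are hypothetical, and the formant tracks are taken as already estimated (the IFC system is not reproduced here).

```python
import numpy as np

# Assumed speed of sound in the vocal tract (cm/s); not stated in the abstract.
SPEED_OF_SOUND_CM_S = 35000.0


def estimate_vtl_cm(formant_tracks_hz):
    """Estimate VTL (cm) from formant trajectories shaped (frames, formants).

    Assumption: a uniform tube closed at the glottis and open at the lips,
    so F_n = (2n - 1) * c / (4 * L). Each formant's accumulated mean gives
    one length estimate, and the estimates are averaged.
    """
    tracks = np.asarray(formant_tracks_hz, dtype=float)
    mean_formants = tracks.mean(axis=0)  # accumulated mean of each formant
    n = np.arange(1, tracks.shape[1] + 1)
    per_formant_vtl = (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * mean_formants)
    return float(per_formant_vtl.mean())


def normalize_formants(formant_tracks_hz, vtl_cm, reference_vtl_cm):
    """Rescale formants to the values a reference-length tract would give.

    Under the uniform-tube assumption, formant frequencies are inversely
    proportional to tract length, so the scale factor is vtl / reference.
    """
    tracks = np.asarray(formant_tracks_hz, dtype=float)
    return tracks * (vtl_cm / reference_vtl_cm)


if __name__ == "__main__":
    # Synthetic tracks for a 14 cm tube: F1..F3 = 625, 1875, 3125 Hz.
    short_tract = np.tile([625.0, 1875.0, 3125.0], (4, 1))
    vtl = estimate_vtl_cm(short_tract)
    print("estimated VTL (cm):", vtl)
    print("normalized to 17.5 cm:", normalize_formants(short_tract, vtl, 17.5)[0])
```

In this sketch the reference length plays the role of the normalization target. Passing the mean VTL of all speakers, a speaker-category mean, or the speaker's own estimate would correspond loosely to the three normalization conditions the abstract compares.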

Details

ISSN :
2329-9304 and 2329-9290
Volume :
29
Database :
OpenAIRE
Journal :
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Accession number :
edsair.doi...........599438376fdb73ecb8b814a580c811d7
Full Text :
https://doi.org/10.1109/taslp.2021.3060172