Vocal Tract Length Estimation Using Accumulated Means of Formants and Its Effects on Speaker-Normalization
- Source :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing. 29:1049-1064
- Publication Year :
- 2021
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2021.
Abstract
- Differences in vocal tract length (VTL) among individual speakers cause variations in the acoustic features of phonemes. This paper proposes a simple method to estimate speaker-specific VTLs and to quantitatively evaluate several speaker-normalization effects of those VTLs. We employed accumulated means of formant trajectories to estimate the VTLs of speakers ranging from children to adults. Formants were estimated with the inverse-filter control (IFC) system, in which the choice of analysis order, that is, the number of formants to be estimated, is automated. To evaluate the speaker-normalization effect of VTLs, we also propose a data reduction method that can reasonably find dense elliptical areas in distributions in the formant space. Using these ellipse areas, we evaluated three normalization effects of VTLs: normalization by the mean of all VTLs as the standard, by speaker-categorical means of VTLs, and by individual VTLs. Relative to the standard area of the original data, the area decreased by 39.5% with the categorical means and by 46.6% with individual VTLs. As a result, the proposed method provides a “normalized vowel map (NVM)” that visualizes universal vowel distributions as a core image of linguistic information. Finally, using the proposed methods, we compared the estimated VTLs with those obtained by another method based on magnetic resonance imaging (MRI) data.
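The abstract does not give the estimator itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the textbook uniform-tube (quarter-wavelength) model, in which the n-th formant of a tube of length L is F_n = (2n - 1)c / (4L), and an assumed speed of sound of 35000 cm/s. The function names (`accumulated_means`, `vtl_from_mean_formants`, `normalize_formants`) are hypothetical and chosen for illustration.

```python
# Assumption: speed of sound in the vocal tract, in cm/s (not taken from the paper).
SPEED_OF_SOUND_CM_S = 35000.0


def accumulated_means(formant_frames):
    """Return the mean of each formant (Hz) accumulated over all frames.

    formant_frames: list of frames, each a list [F1, F2, ...] in Hz,
    all with the same number of formants.
    """
    n_formants = len(formant_frames[0])
    sums = [0.0] * n_formants
    for frame in formant_frames:
        for i, freq in enumerate(frame):
            sums[i] += freq
    return [s / len(formant_frames) for s in sums]


def vtl_from_mean_formants(mean_formants, c=SPEED_OF_SOUND_CM_S):
    """Estimate VTL (cm) under the uniform-tube assumption.

    Each formant gives L = (2n - 1) * c / (4 * F_n); the estimates are averaged.
    """
    estimates = [
        (2 * n - 1) * c / (4.0 * freq)
        for n, freq in enumerate(mean_formants, start=1)
    ]
    return sum(estimates) / len(estimates)


def normalize_formants(frame, speaker_vtl, reference_vtl):
    """Rescale one frame of formants to a reference VTL.

    In the uniform-tube model, formant frequencies scale inversely with length,
    so a shorter tract (higher formants) is scaled down toward the reference.
    """
    scale = speaker_vtl / reference_vtl
    return [freq * scale for freq in frame]


# Illustrative (made-up) formant tracks for two hypothetical speakers.
adult_frames = [[480.0, 1450.0, 2450.0], [520.0, 1550.0, 2550.0]]
child_frames = [[625.0, 1875.0, 3125.0]]

adult_vtl = vtl_from_mean_formants(accumulated_means(adult_frames))
child_vtl = vtl_from_mean_formants(accumulated_means(child_frames))
child_normalized = normalize_formants(child_frames[0], child_vtl, adult_vtl)
```

Here the reference VTL plays the role of the "standard" in the abstract; replacing it with a category mean or keeping each speaker's own estimate corresponds loosely to the three normalization settings the abstract compares.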
- Subjects :
- Normalization (statistics)
Acoustics and Ultrasonics
business.industry
Ranging
Pattern recognition
Ellipse
030507 speech-language pathology & audiology
03 medical and health sciences
Computational Mathematics
Formant
Vowel
Computer Science (miscellaneous)
Artificial intelligence
Electrical and Electronic Engineering
0305 other medical science
business
Categorical variable
Vocal tract
Data reduction
Mathematics
Details
- ISSN :
- 2329-9304 and 2329-9290
- Volume :
- 29
- Database :
- OpenAIRE
- Journal :
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- Accession number :
- edsair.doi...........599438376fdb73ecb8b814a580c811d7
- Full Text :
- https://doi.org/10.1109/taslp.2021.3060172