
A procedure for estimating gestural scores from speech acoustics.

Authors :
Nam, Hosung
Mitra, Vikramjit
Tiede, Mark
Hasegawa-Johnson, Mark
Espy-Wilson, Carol
Saltzman, Elliot
Goldstein, Louis
Source :
Journal of the Acoustical Society of America; Dec 2012, Vol. 132 Issue 6, p3980-3989, 10p
Publication Year :
2012

Abstract

Speech can be represented as a constellation of constricting vocal tract actions called gestures, whose temporal patterning with respect to one another is expressed in a gestural score. Current speech datasets do not come with gestural annotation, and no formal gestural annotation procedure exists at present. This paper describes an iterative analysis-by-synthesis landmark-based time-warping architecture to perform gestural annotation of natural speech. For a given utterance, the Haskins Laboratories Task Dynamics and Application (TADA) model is employed to generate a corresponding prototype gestural score. The gestural score is temporally optimized through an iterative time-warping process such that the acoustic distance between the original and TADA-synthesized speech is minimized. This paper demonstrates that the proposed iterative approach is superior to conventional acoustically-referenced dynamic time-warping procedures and provides reliable gestural annotation for speech datasets. [ABSTRACT FROM AUTHOR]
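
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of textbook dynamic time warping (DTW) between two sequences of acoustic feature vectors. It is not the authors' iterative landmark-based procedure and does not involve TADA; the function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional features are all assumptions made for illustration.

```python
import math


def dtw_distance(ref, test):
    """Align two sequences of feature vectors with classic DTW.

    Returns (total_cost, path), where path is a list of (ref_index,
    test_index) pairs. Hypothetical helper, for illustration only.
    """
    n, m = len(ref), len(test)
    # cost[i][j] = best cumulative cost aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local distance: Euclidean distance between frames
            d = math.dist(ref[i - 1], test[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance ref only
                cost[i][j - 1],      # advance test only
            )

    # Backtrack from the end to recover the warping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy 1-D "features": the second sequence repeats its first frame
reference = [(0.0,), (1.0,), (2.0,)]
synthesized = [(0.0,), (0.0,), (1.0,), (2.0,)]
total, alignment = dtw_distance(reference, synthesized)
print(total, alignment)
```

In an analysis-by-synthesis setting such as the one the abstract describes, a distance of this kind could serve as the quantity to minimize between natural and synthesized speech; how the paper actually computes and uses its acoustic distance is not stated in this record.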

Details

Language :
English
ISSN :
0001-4966
Volume :
132
Issue :
6
Database :
Complementary Index
Journal :
Journal of the Acoustical Society of America
Publication Type :
Academic Journal
Accession number :
84122223
Full Text :
https://doi.org/10.1121/1.4763545