Automatic Segmentation and Identification of Mixed-Language Speech Using Delta-BIC and LSA-Based GMMs.

Authors :: Chung-Hsien Wu
Yu-Hsien Chiu
Chi-Jiun Shia
Chun-Yu Lin
Source :: IEEE Transactions on Audio, Speech & Language Processing; Jan2006, Vol. 14 Issue 1, p266-276, 11p, 4 Diagrams, 2 Charts, 6 Graphs
Publication Year :: 2006
Abstract: This paper proposes an approach to segmenting and identifying mixed-language speech. A delta Bayesian information criterion (delta-BIC) is first applied to segment the input speech utterance into a sequence of language-dependent segments using acoustic features. A VQ-based bi-gram model is used to characterize the acoustic-phonetic dynamics of two consecutive codewords in a language. Accordingly, the language-specific acoustic-phonetic properties of phone sequences are integrated into the identification process. A Gaussian mixture model (GMM) is used to model codeword occurrence vectors, orthonormally transformed using latent semantic analysis (LSA), for each language-dependent segment. A filtering method is used to smooth the hypothesized language sequence and thus eliminate noise-like components of the detected language sequence generated by the maximum likelihood estimation. Finally, a dynamic programming method is used to determine the language boundaries globally. Experimental results show that for Mandarin, English, and Taiwanese, a recall rate of 0.87 for language boundary segmentation was obtained. Based on this recall rate, the proposed approach achieved language identification accuracies of 92.1% and 74.9% for single-language and mixed-language speech, respectively. [ABSTRACT FROM AUTHOR]
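
The abstract does not give the exact delta-BIC formulation, so the following is only a minimal sketch of the commonly used full-covariance Gaussian version of the criterion, not the authors' implementation. The function names (`delta_bic`, `best_change_point`), the `margin` and `penalty_weight` parameters, and the use of NumPy are all assumptions introduced for illustration. Given a window of acoustic feature frames, it compares one Gaussian fitted to the whole window against two Gaussians fitted to the left and right parts, and reports a boundary only when the split scores above zero after the model-complexity penalty.

```python
import numpy as np


def _logdet_cov(x, eps=1e-6):
    """Log-determinant of the maximum-likelihood covariance of frames x (n x d)."""
    d = x.shape[1]
    # bias=True gives the ML estimate; eps * I keeps the matrix well conditioned.
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d) + eps * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames, i, penalty_weight=1.0):
    """Delta-BIC for a hypothesized boundary before frame i (assumed formulation).

    Positive values favor modeling frames[:i] and frames[i:] separately.
    """
    n, d = frames.shape
    left, right = frames[:i], frames[i:]
    # Extra free parameters of the two-Gaussian model: one mean and one covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    gain = 0.5 * (
        n * _logdet_cov(frames)
        - len(left) * _logdet_cov(left)
        - len(right) * _logdet_cov(right)
    )
    return gain - penalty


def best_change_point(frames, margin=20, penalty_weight=1.0):
    """Return (index, score) of the best boundary, or (None, score) if none is positive."""
    n = len(frames)
    candidates = range(margin, n - margin)
    scores = [delta_bic(frames, i, penalty_weight) for i in candidates]
    k = int(np.argmax(scores))
    index = margin + k if scores[k] > 0 else None
    return index, scores[k]
```

In a full system this test would typically be applied over sliding or growing windows, with the resulting segments passed to the language identification stage; the window strategy here is left out because the abstract does not describe it.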

Subjects :: SPEECH
MIXED languages
LANGUAGE & languages
SPEECH perception
SEMANTICS
DYNAMIC programming

Details

Language :: English
ISSN :: 15587916
Volume :: 14
Issue :: 1
Database :: Complementary Index
Journal :: IEEE Transactions on Audio, Speech & Language Processing
Publication Type :: Academic Journal
Accession number :: 23172990
Full Text :: https://doi.org/10.1109/TSA.2005.852992

