Spectro-temporal modulation energy based mask for robust speaker identification.

Authors :
Chi, Tai-Shih
Lin, Ting-Han
Hsu, Chung-Chien
Source :
Journal of the Acoustical Society of America; May 2012, Vol. 131, Issue 5, pp. EL368-EL374, 7 pp.
Publication Year :
2012

Abstract

Spectro-temporal modulations of speech encode speech structures and speaker characteristics. An algorithm that distinguishes speech from non-speech based on spectro-temporal modulation energies is proposed and evaluated in robust text-independent closed-set speaker identification simulations using the TIMIT and GRID corpora. Simulation results show that the proposed method produces much higher speaker identification rates in all signal-to-noise ratio (SNR) conditions than the baseline system using mel-frequency cepstral coefficients. In addition, in low SNR (≤ 10 dB) conditions, the proposed method outperforms the system that uses auditory-based nonnegative tensor cepstral coefficients [Q. Wu and L. Zhang, 'Auditory sparse representation for robust speaker recognition based on tensor structure,' EURASIP J. Audio, Speech, Music Process. 2008, 578612 (2008)]. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0001-4966
Volume :
131
Issue :
5
Database :
Complementary Index
Journal :
Journal of the Acoustical Society of America
Publication Type :
Academic Journal
Accession number :
74978970
Full Text :
https://doi.org/10.1121/1.3697534