Multi-resolution time frequency feature and complementary combination for short utterance speaker recognition.

Authors :: Li, Zhi-Yi
Zhang, Wei-Qiang
Liu, Jia
Source :: Multimedia Tools & Applications; Feb2015, Vol. 74 Issue 3, p937-953, 17p
Publication Year :: 2015
Abstract: A human speaker recognition expert often examines the speech spectrogram at multiple scales, especially under the short utterance condition. Inspired by this practice, this paper proposes a novel multi-resolution time frequency (MRTF) feature extraction method, which performs a 2-dimensional discrete cosine transform (DCT) at multiple scales on the time-frequency spectrogram matrix and then selects and combines the multi-scale transformed elements into the final feature. Compared with traditional Mel-Frequency Cepstral Coefficient (MFCC) feature extraction, the proposed method makes better use of multi-resolution time-frequency information. We also propose three complementary strategies for combining MFCC and MRTF: at the feature level, at the i-vector level, and at the score level. Comparing their performance, we find that the best results are obtained by combination at the i-vector level. On the three NIST 2008 Speaker Recognition Evaluation datasets, the proposed method improves performance more under the short utterance condition than under the long utterance condition. After combination, we achieve an EER of 11.32% and a MinDCF of 0.054 in the 10sec-10sec trials on the male dataset, an absolute EER improvement of 3% over the best previously reported result in this field. [ABSTRACT FROM AUTHOR]
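
The multi-scale 2-D DCT idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the window lengths, the coefficient selection rule, or the combination details, so everything specific here is an assumption made for illustration: the function names (`dct_matrix`, `dct2`, `mrtf_features`, `combine_feature_level`), the scales of 4, 8, and 16 frames, the choice to keep a low-order 4 x 3 block of coefficients, and the use of plain concatenation for feature-level combination. It should not be read as the authors' implementation.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def dct2(block):
    """2-D DCT-II of a (freq, time) block: transform along both axes."""
    n_freq, n_time = block.shape
    return dct_matrix(n_freq) @ block @ dct_matrix(n_time).T


def mrtf_features(spectrogram, scales=(4, 8, 16), keep=(4, 3)):
    """Per-frame multi-scale 2-D DCT features (illustrative assumption).

    spectrogram: (n_freq, n_frames) matrix, e.g. log-magnitude values.
    scales: window lengths in frames, one per resolution (assumed values).
    keep: number of low-order (freq, time) coefficients kept per scale.
    Returns an array of shape (n_frames, len(scales) * keep[0] * keep[1]).
    """
    n_freq, n_frames = spectrogram.shape
    half = max(scales) // 2
    # Repeat edge frames so every window fits around every frame.
    padded = np.pad(spectrogram, ((0, 0), (half, half)), mode="edge")
    feats = []
    for c in range(n_frames):
        centre = c + half
        parts = []
        for w in scales:
            start = centre - w // 2
            block = padded[:, start:start + w]
            coeffs = dct2(block)
            # Keep only the low-order coefficients of this scale.
            parts.append(coeffs[:keep[0], :keep[1]].ravel())
        feats.append(np.concatenate(parts))
    return np.asarray(feats)


def combine_feature_level(mfcc, mrtf):
    """Feature-level combination, assumed here to be frame-wise concatenation.

    Both inputs are (n_frames, dim) arrays with matching frame counts.
    """
    return np.hstack([mfcc, mrtf])
```

With a spectrogram of 20 frequency bins and 50 frames, `mrtf_features` returns a 50 x 36 matrix (3 scales, 12 coefficients each). Combination at the i-vector level or score level, which the abstract reports as alternatives, would operate on the outputs of later modelling stages and is not sketched here.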

Subjects :: VOICEPRINTS
SOUND spectrography
AUTOMATIC speech recognition
IDENTIFICATION
SECURITY systems
COMPUTER user identification

Details

Language :: English
ISSN :: 13807501
Volume :: 74
Issue :: 3
Database :: Complementary Index
Journal :: Multimedia Tools & Applications
Publication Type :: Academic Journal
Accession number :: 100953258
Full Text :: https://doi.org/10.1007/s11042-013-1705-4
