Moving average multi directional local features for speaker recognition.

Authors :: Mahmood, Awais
Muhammad, Ghulam
Alsulaiman, Mansour
Dhahri, Habib
Othman, Esam M. Asem
Faisal, Mohammed
Source :: Cluster Computing. Jan2019 Supplement 1, Vol. 22 Issue 1, p2145-2157. 13p.
Publication Year :: 2019
Abstract: A new speech feature extraction technique called moving average multi directional local features (MA-MDLF) is presented in this paper. The method is based on linear regression (LR) and moving average (MA) in the time–frequency plane. Three-point LR is taken along the time axis and the frequency axis, and three-point MA is taken along the 45° and 135° directions in the time–frequency plane. The LR captures voice onset/offset and formant contours, while the MA captures the dynamics on the time–frequency axes, which can be seen as voiceprints. The performance of MA-MDLF is compared with that of commonly used speech features in speaker recognition. The comparison is performed in a speaker recognition system (SRS) under three conditions: clean speech, mobile speech, and cross channel. MA-MDLF showed better performance than the baseline MFCC, RASTA-PLP, and LPCC features. In clean and mobile speech, MA-MDLF performed the best, and it also performed excellently in the cross-channel task. We also evaluated MA-MDLF using three speech databases, namely the KSU, LDC Babylon, and TIMIT databases, and found that MA-MDLF outperformed the other commonly used features on speech from all three databases. The first and second databases are for Arabic speech, while the third is for English speech. [ABSTRACT FROM AUTHOR]
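
The four directional operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say what time–frequency representation is used, how borders are handled, or which LR output is kept. The sketch assumes a plain 2-D grid `spec[f][t]` (rows are frequency bands, columns are time frames), keeps the fitted slope as the LR output, skips border points, and treats 45° as the diagonal on which frequency rises as time advances. The names `three_point_slope` and `ma_mdlf_sketch` are hypothetical.

```python
def three_point_slope(a, b, c):
    """Least-squares slope of a line fitted through (-1, a), (0, b), (1, c)."""
    # With x = -1, 0, 1 the fitted slope reduces to (c - a) / 2;
    # the middle value b only affects the intercept.
    return (c - a) / 2.0


def ma_mdlf_sketch(spec):
    """Return (lr_time, lr_freq, ma_45, ma_135) over the interior points.

    spec[f][t] is an assumed time-frequency grid: f indexes frequency
    bands, t indexes time frames. Border rows and columns are skipped.
    """
    n_f, n_t = len(spec), len(spec[0])
    lr_time, lr_freq, ma_45, ma_135 = [], [], [], []
    for f in range(1, n_f - 1):
        row_t, row_f, row_45, row_135 = [], [], [], []
        for t in range(1, n_t - 1):
            # Three-point LR along the time axis (same band, adjacent frames).
            row_t.append(three_point_slope(spec[f][t - 1], spec[f][t], spec[f][t + 1]))
            # Three-point LR along the frequency axis (same frame, adjacent bands).
            row_f.append(three_point_slope(spec[f - 1][t], spec[f][t], spec[f + 1][t]))
            # Three-point MA along the assumed 45-degree diagonal.
            row_45.append((spec[f - 1][t - 1] + spec[f][t] + spec[f + 1][t + 1]) / 3.0)
            # Three-point MA along the assumed 135-degree diagonal.
            row_135.append((spec[f + 1][t - 1] + spec[f][t] + spec[f - 1][t + 1]) / 3.0)
        lr_time.append(row_t)
        lr_freq.append(row_f)
        ma_45.append(row_45)
        ma_135.append(row_135)
    return lr_time, lr_freq, ma_45, ma_135
```

For a 3 × 3 grid the function returns one value per map, computed at the centre point; how these maps would be combined into a final feature vector is not stated in the abstract and is left out here.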