101. Moving average multi directional local features for speaker recognition
- Author
Ghulam Muhammad, Esam Othman, Habib Dhahri, Awais Mahmood, Mohammed Faisal, and Mansour Alsulaiman
- Subjects
Computer Networks and Communications; Computer science; Speech recognition; Feature extraction; 020206 networking & telecommunications; 02 engineering and technology; Speaker recognition; Formant; Moving average; Linear regression; Multi directional; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; Software
- Abstract
This paper presents a new speech feature extraction technique called moving average multi directional local features (MA-MDLF). The method is based on linear regression (LR) and moving average (MA) in the time–frequency plane. A three-point LR is taken along the time axis and the frequency axis, and a three-point MA is taken along the 45° and 135° directions in the time–frequency plane. The LR captures voice onset/offset and formant contours, while the MA captures the dynamics across the time–frequency axes, which can be seen as voiceprints. The performance of MA-MDLF is compared with that of commonly used speech features in a speaker recognition system (SRS) under three conditions: clean speech, mobile speech, and cross channel. MA-MDLF outperformed the baseline MFCC, RASTA-PLP, and LPCC features. It performed best on clean and mobile speech and also performed very well in the cross-channel task. MA-MDLF was also evaluated on three speech databases, namely the KSU, LDC Babylon, and TIMIT databases, and it outperformed the other commonly used features on speech from all three. The first and second databases contain Arabic speech, while the third contains English speech.
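The four directional operations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `ma_mdlf_sketch`, the input layout (rows are frequency bands, columns are time frames), the handling of borders (only interior points are kept), and the stacking of the four maps are all assumptions made for illustration. The abstract does not specify the spectral front end or any later processing, so none is shown. The sketch relies on the fact that the least-squares slope through three equally spaced points equals (last − first) / 2.

```python
import numpy as np


def ma_mdlf_sketch(spec):
    """Illustrative directional local features on a time-frequency matrix.

    spec: 2-D array, rows = frequency bands, columns = time frames
          (this layout is an assumption, not taken from the paper).
    Returns an array of shape (4, F - 2, T - 2) holding, for every
    interior point: LR slope along time, LR slope along frequency,
    3-point MA along 45 degrees, 3-point MA along 135 degrees.
    """
    s = np.asarray(spec, dtype=float)
    if s.ndim != 2 or min(s.shape) < 3:
        raise ValueError("spec must be 2-D with at least 3 rows and 3 columns")

    center = s[1:-1, 1:-1]

    # Three-point linear regression: the least-squares slope through
    # equally spaced points (-1, 0, +1) reduces to (last - first) / 2.
    lr_time = (s[1:-1, 2:] - s[1:-1, :-2]) / 2.0
    lr_freq = (s[2:, 1:-1] - s[:-2, 1:-1]) / 2.0

    # Three-point moving averages along the two diagonals.
    # 45 degrees: frequency rises as time advances.
    ma_45 = (s[:-2, :-2] + center + s[2:, 2:]) / 3.0
    # 135 degrees: frequency falls as time advances.
    ma_135 = (s[2:, :-2] + center + s[:-2, 2:]) / 3.0

    return np.stack([lr_time, lr_freq, ma_45, ma_135])


if __name__ == "__main__":
    # Synthetic plane: value = 10 * band index + frame index.
    demo = np.add.outer(10.0 * np.arange(4), np.arange(5.0))
    feats = ma_mdlf_sketch(demo)
    print(feats.shape)
```

On a purely linear plane such as the synthetic one above, both LR maps are constant and both MA maps reproduce the interior values, which is a convenient sanity check for the indexing.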
- Published
- 2018