GMAT: Glottal closure instants detection based on the Multiresolution Absolute Teager–Kaiser energy operator
- Source :
- Digital Signal Processing. 69:286-299
- Publication Year :
- 2017
- Publisher :
- Elsevier BV, 2017.
- Abstract :
- Detection of glottal closure instants (GCIs) is important to many speech applications. However, most existing algorithms cannot achieve computational efficiency and accuracy simultaneously. In this paper, we present GMAT (Glottal closure instants detection based on the Multiresolution Absolute Teager–Kaiser Energy Operator), which detects GCIs with high accuracy and low computational cost. To account for the nonlinearity of speech production, the Teager–Kaiser Energy Operator (TKEO) is used to detect GCIs: an instant with a high absolute TKEO value often indicates a GCI. To enhance robustness, three multiscale pooling techniques (max pooling, multiscale product, and mean pooling) are applied to fuse the absolute TKEOs of several scales. Finally, GCIs are detected from the fused results. In the performance evaluation, GMAT is compared with three state-of-the-art methods: MSM (Most Singular Manifold-based approach), ZFR (Zero Frequency Resonator-based method), and SEDREAMS (Speech Event Detection using the Residual Excitation And a Mean-based Signal). On clean speech, experiments show that GMAT attains a higher identification rate and higher accuracy than MSM. Compared with ZFR and SEDREAMS, GMAT gives almost the same reliability and higher accuracy. On noisy speech, GMAT is the most robust method at most SNR levels. An additional comparison shows that GMAT is less sensitive to the choice of scale in multiscale processing and has low computational cost. Finally, pathological speech identification, a concrete application of GCIs, is included to show the efficacy of GMAT in practice. This paper thus investigates the potential of TKEO for GCI detection. Given these results, GMAT is a promising choice for GCI detection, particularly in real-time scenarios, and may contribute to systems relying on GCIs, where both accuracy and computational cost are crucial.
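The core idea in the abstract (absolute TKEO values computed at several scales and fused by pooling) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common lagged form of the operator, |x[n]^2 - x[n-k] x[n+k]|, as the "scale k" variant; the function names `abs_tkeo` and `fuse_scales` and the scale set (1, 2, 4) are hypothetical; and the final GCI selection step, which the abstract does not specify, is left out.

```python
import numpy as np

def abs_tkeo(x, k=1):
    """Absolute Teager-Kaiser energy with lag k: |x[n]^2 - x[n-k] * x[n+k]|.

    Assumption: the lag k plays the role of the "scale"; the record's
    abstract does not spell out the multiresolution definition.
    The first and last k samples are left at zero.
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    if len(x) > 2 * k:
        out[k:-k] = np.abs(x[k:-k] ** 2 - x[:-2 * k] * x[2 * k:])
    return out

def fuse_scales(x, scales=(1, 2, 4), mode="mean"):
    """Fuse absolute TKEOs of several scales (hypothetical scale set)."""
    stack = np.stack([abs_tkeo(x, k) for k in scales])
    if mode == "max":        # max pooling across scales
        return stack.max(axis=0)
    if mode == "product":    # multiscale product
        return stack.prod(axis=0)
    if mode == "mean":       # mean pooling across scales
        return stack.mean(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")

# Toy input: a single impulse stands in for an abrupt excitation event.
signal = np.zeros(32)
signal[10] = 1.0
peak_index = int(np.argmax(fuse_scales(signal, mode="mean")))
```

On this toy input the fused curve peaks at the impulse position for all three pooling modes, which matches the abstract's observation that a high absolute TKEO value often marks a GCI. Real speech would require a peak-picking rule on top of the fused curve.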
- Subjects :
- Speech production
Applied Mathematics
Speech recognition
Pooling
020206 networking & telecommunications
02 engineering and technology
Glottal closure
Residual
Energy operator
Identification rate
030507 speech-language pathology & audiology
03 medical and health sciences
Nonlinear system
Computational Theory and Mathematics
Artificial Intelligence
Robustness (computer science)
Signal Processing
0202 electrical engineering, electronic engineering, information engineering
Computer Vision and Pattern Recognition
Electrical and Electronic Engineering
Statistics, Probability and Uncertainty
0305 other medical science
Algorithm
Mathematics
Details
- ISSN :
- 1051-2004
- Volume :
- 69
- Database :
- OpenAIRE
- Journal :
- Digital Signal Processing
- Accession number :
- edsair.doi...........c77328cc0765ca4045f6d43323dd047f