
Incorporation of Manner of Articulation Constraint in LSTM for Speech Recognition.

Authors :
Pradeep, R.
Rao, K. Sreenivasa
Source :
Circuits, Systems & Signal Processing. Aug 2019, Vol. 38 Issue 8, p3482-3500. 19p.
Publication Year :
2019

Abstract

The variants of recurrent neural networks such as long short-term memory (LSTM) and gated recurrent units are successful in sequence modelling tasks such as automatic speech recognition. However, the decoded sequence is prone to false substitutions, insertions and deletions. In our work, we investigate the outputs of the hidden layers of an LSTM trained on the TIMIT dataset. Interestingly, we found that the first hidden layer captures information related to some broad manners of articulation, and the successive hidden layers cluster further within these broad manners. We detect two broad manners of articulation, namely sonorants (vowels, semi-vowels, nasals) and obstruents (fricatives, stops, affricates), by exploiting the spectral flatness measure (SFM) on the linear prediction coefficients. We define an additional gate, called the manner of articulation gate, that is high if the broad manner of articulation of the t-th frame is the same as that of the (t+1)-th frame. The manner of articulation detection is embedded at the output of the activation gate of the LSTM at the first hidden layer. By doing so, substitutions of sonorants by obstruents at the output layer are minimized. The proposed method decreased the phone error rate by 0.7% when evaluated on the core test set of TIMIT. [ABSTRACT FROM AUTHOR]
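
The detection step described in the abstract can be illustrated with a minimal sketch. The Python code below is not from the paper: the use of librosa for LPC estimation, the LPC order of 12, the 512-point frequency grid, the SFM threshold of 0.3 and the handling of the final frame are all illustrative assumptions. It computes the spectral flatness measure on the LP spectral envelope of each speech frame, labels frames as sonorant or obstruent, and forms the binary frame-to-frame manner of articulation gate.

import numpy as np
import librosa

def lp_spectral_flatness(frame, order=12, n_fft=512):
    """SFM of the LP spectral envelope of one speech frame (values in (0, 1])."""
    a = librosa.lpc(frame.astype(float), order=order)   # LPC coefficients [1, a1, ..., ap]
    # LP spectral envelope: |1 / A(e^{jw})|^2 on a uniform frequency grid
    A = np.fft.rfft(a, n_fft)
    env = 1.0 / (np.abs(A) ** 2 + 1e-12)
    # SFM = geometric mean / arithmetic mean of the envelope
    return np.exp(np.mean(np.log(env + 1e-12))) / (np.mean(env) + 1e-12)

def broad_class(frames, sfm_threshold=0.3):
    """1 = sonorant (peaky LP spectrum, low flatness), 0 = obstruent (noise-like, high flatness).
    The threshold value is an assumption, not taken from the paper."""
    sfm = np.array([lp_spectral_flatness(f) for f in frames])
    return (sfm < sfm_threshold).astype(int)

def manner_gate(classes):
    """Gate m_t = 1 if frame t and frame t+1 share the same broad class, else 0."""
    m = (classes[:-1] == classes[1:]).astype(float)
    return np.append(m, 1.0)   # last frame has no successor; keep its gate open (assumption)

In the paper, this gate is then embedded at the output of the activation gate of the first LSTM hidden layer; the exact form of that coupling is specific to the proposed architecture and is not reproduced here.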

Details

Language :
English
ISSN :
0278-081X
Volume :
38
Issue :
8
Database :
Academic Search Index
Journal :
Circuits, Systems & Signal Processing
Publication Type :
Academic Journal
Accession number :
137338109
Full Text :
https://doi.org/10.1007/s00034-019-01074-5