Progress of machine learning based automatic phoneme recognition and its prospect

Authors :
Mousumi Malakar
Ravindra B. Keskar
Source :
Speech Communication. 135:37-53
Publication Year :
2021
Publisher :
Elsevier BV, 2021.

Abstract

A phoneme is the smallest perceptually distinct unit of sound that distinguishes one word from another in a particular language. Every language has its own set of phonemes, and every word can be treated as an ordered sequence of phonemes. The number of phonemes in a language is always very small compared with the size of its vocabulary. These facts have made phoneme recognition an attractive proposition throughout the history of Automatic Speech Processing (ASP). As a result, the classification and recognition of phonemes are considered primary tasks of automatic speech recognition (ASR) systems, irrespective of application domain. The dynamic nature of phonemes and the several sources of their variability create many barriers to the accurate identification of phonemes from an acoustic signal. Machine Learning (ML) based techniques have contributed remarkably to overcoming these obstacles in automatic phoneme recognition (APR). Nowadays, with large amounts of data available, ML-based ASR is preferred over acoustic-phonetic methods because of its simplicity. ML-based techniques do not follow the conventional approach of identifying acoustic properties. Rather, they build their own trained model (algorithm) from readily available data: they discover hidden patterns in speech signals and acquire predictive intelligence through learning. ML techniques can therefore be said to provide a more generalized model for phoneme classification. In this paper, we present a comprehensive survey of ML tools for building phoneme recognizers. We also highlight some applications of speech (especially phoneme) recognition that illustrate the current scope as well as the future prospects of APR.

Details

ISSN :
0167-6393
Volume :
135
Database :
OpenAIRE
Journal :
Speech Communication
Accession number :
edsair.doi...........ffa0153b9b627666f0f4da9150b96538
Full Text :
https://doi.org/10.1016/j.specom.2021.09.006