Dynamic Bayesian Networks for multi-band automatic speech recognition

Authors :
Christophe Antoine
Khalid Daoudi
Dominique Fohr
Analysis, perception and recognition of speech (PAROLE)
INRIA Lorraine
Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)
Source :
Computer Speech and Language, Elsevier, 2003, 17 (2-3), pp.263-285. ⟨10.1016/S0885-2308(03)00011-1⟩
Publication Year :
2003
Publisher :
HAL CCSD, 2003.

Abstract

Peer-reviewed journal article. This paper presents a new approach to multi-band automatic speech recognition that overcomes many limitations of classical multi-band systems. The principle of this approach is to build a speech model in the time-frequency domain using the formalism of dynamic Bayesian networks. In contrast to classical multi-band modeling, this formalism leads to a probabilistic speech model that allows communication between the different sub-bands; consequently, no recombination step is required during recognition. We develop efficient learning and decoding algorithms for both isolated and continuous speech recognition. We present illustrative experiments on isolated and connected digit recognition tasks. These experiments show that the new approach is very promising for noisy speech recognition.

Details

Language :
English
ISSN :
0885-2308 and 1095-8363
Database :
OpenAIRE
Journal :
Computer Speech and Language, Elsevier, 2003, 17 (2-3), pp.263-285. ⟨10.1016/S0885-2308(03)00011-1⟩
Accession number :
edsair.doi.dedup.....b2cb39d1be95486d35edf69a65c3da3b
Full Text :
https://doi.org/10.1016/S0885-2308(03)00011-1