Start Over

Generative Adversarial Networks (GANs) for Audio-Visual Speech Recognition in Artificial Intelligence IoT.

Authors :: He, Yibo
Seng, Kah Phooi
Ang, Li Minn
Source :: Information (2078-2489). Oct2023, Vol. 14 Issue 10, p575. 23p.
Publication Year :: 2023
Abstract: This paper proposes a novel multimodal generative adversarial network AVSR (multimodal AVSR GAN) architecture, to improve both the energy efficiency and the AVSR classification accuracy of artificial intelligence Internet of things (IoT) applications. The audio-visual speech recognition (AVSR) modality is a classical multimodal modality, which is commonly used in IoT and embedded systems. Examples of suitable IoT applications include in-cabin speech recognition systems for driving systems, AVSR in augmented reality environments, and interactive applications such as virtual aquariums. The application of multimodal sensor data for IoT applications requires efficient information processing, to meet the hardware constraints of IoT devices. The proposed multimodal AVSR GAN architecture is composed of a discriminator and a generator, each of which is a two-stream network, corresponding to the audio stream information and the visual stream information, respectively. To validate this approach, we used augmented data from well-known datasets (LRS2-Lip Reading Sentences 2 and LRS3) in the training process, and testing was performed using the original data. The research and experimental results showed that the proposed multimodal AVSR GAN architecture improved the AVSR classification accuracy. Furthermore, in this study, we discuss the domain of GANs and provide a concise summary of the proposed GANs. [ABSTRACT FROM AUTHOR]

Subjects :: *GENERATIVE adversarial networks
*ARTIFICIAL intelligence
*SPEECH perception
*AUTOMATIC speech recognition
*INTERNET of things
*STREAMING audio

Details

Language :: English
ISSN :: 20782489
Volume :: 14
Issue :: 10
Database :: Academic Search Index
Journal :: Information (2078-2489)
Publication Type :: Academic Journal
Accession number :: 173315221
Full Text :: https://doi.org/10.3390/info14100575

Full Text Access

View/download PDF

Tools

Email
Cite

Printer

Authors Abstract Subjects Details

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Generative Adversarial Networks (GANs) for Audio-Visual Speech Recognition in Artificial Intelligence IoT.

Abstract

Subjects

Details

Tools

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Generative Adversarial Networks (GANs) for Audio-Visual Speech Recognition in Artificial Intelligence IoT.

Abstract

Subjects

Details

Tools

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources