Survey Paper: Introduction to Generative Adversarial Networks in Speech Imitation

Authors :: Aruna Bhat
Rajashree Guha
Vaibhav Sharma
Yatin Yadav
Source :: 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS).
Publication Year :: 2020
Publisher :: IEEE, 2020.
Abstract: Speech imitation, that is, converting a source speaker’s utterance so that it sounds as though it were uttered by the target speaker, has been an active and fast-changing area of research in recent times. The paper focuses on techniques for modifying the sounds produced by the original speaker (also called the source). Speech imitation involves transforming the source speaker’s speech so that it sounds as though it were produced by the selected speaker (also called the target). Most speech imitation systems require parallel data to produce natural, unprocessed-sounding speech; examples of such systems include RBM [1]. However, parallel data poses several obstacles: it is arduous to gather in real time, and it requires the vocabulary to match exactly, which may not be available when training the model. To remove this dependency on parallel data, a recently introduced class of models, GANs [3], is being explored.

Subjects :: Vocabulary
Dependency (UML)
Computer science
media_common.quotation_subject
Speech recognition
020206 networking & telecommunications
02 engineering and technology
Construct (python library)
Task (project management)
0202 electrical engineering, electronic engineering, information engineering
Natural (music)
020201 artificial intelligence & image processing
Imitation (music)
Utterance
Generative grammar
media_common

Database :: OpenAIRE
Journal :: 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS)
Accession number :: edsair.doi...........0b71fb746947aa167bf8405d76217d26
