Author: "Peter Birkholz" / Journal: ieee/acm transactions on audio, speech, and language processing - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Peter Birkholz"' showing total 4 results

1. Artificial Vocal Learning Guided by Phoneme Recognition and Visual Information

Author: Paul Konstantin Krug, Peter Birkholz, Branislav Gerazov, Daniel Rudolph van Niekerk, Anqi Xu, and Yi Xu
Subjects: Computational Mathematics, Acoustics and Ultrasonics, Computer Science (miscellaneous), Electrical and Electronic Engineering
Abstract: This paper introduces a paradigm shift regarding vocal learning simulations, in which the communicative function of speech acquisition determines the learning process and intelligibility is considered the main measure of learning success. Thereby, a novel approach for artificial early vocal learning is presented that utilizes deep neural network-based phoneme recognition in order to calculate the speech acquisition objective function. This function guides a learning framework that involves the state-of-the-art articulatory speech synthesizer VocalTractLab as the motor-to-acoustic forward model. It is shown that in this way an extensive set of German phonemes consisting of most German consonants and all stressed vowels can be produced successfully. The synthetic phonemes were rated as highly intelligible by human listeners in a listening experiment. Furthermore, it is shown that visual speech information, such as lip and jaw movements, can be extracted from video recordings and incorporated into the learning framework as an additional loss component during the optimization process. It was observed that this visual loss did not increase the overall intelligibility of phonemes. Instead, the visual loss acted as a regularization mechanism that facilitated the finding of more biologically plausible solutions in the articulatory domain.
Published: 2023

2. Articulatory Synthesis of Vocalized /r/ Allophones in German

Author: Yingming Gao, Simon Stone, and Peter Birkholz
Subjects: German, Articulatory synthesis, Computational Mathematics, Acoustics and Ultrasonics, Speech recognition, Computer Science (miscellaneous), Electrical and Electronic Engineering, Psychology
Published: 2022

3. Non-Invasive Silent Phoneme Recognition Using Microwave Signals

Author: Klaus Wolf, Dirk Plettemeier, Peter Birkholz, and Simon Stone
Subjects: Acoustics and Ultrasonics, Computer science, Frequency band, Speech recognition, Networking & telecommunications, Engineering and technology, Linear discriminant analysis, Speech processing, Speech-language pathology & audiology, Medical and health sciences, Computational Mathematics, Silent speech interface, Electrical engineering, electronic engineering, information engineering, Computer Science (miscellaneous), Reflection (physics), Electrical and Electronic Engineering, Radar, Antenna (radio), Other medical science, Vocal tract
Abstract: Besides the recognition of audible speech, there is currently an increasing interest in the recognition of silent speech, which has a range of novel applications. A major obstacle for a wide spread of silent-speech technology is the lack of measurement methods for speech movements that are convenient, non-invasive, portable, and robust at the same time. Therefore, as an alternative to established methods, we examined to what extent different phonemes can be discriminated from the electromagnetic transmission and reflection properties of the vocal tract. To this end, we attached two Vivaldi antennas on the cheek and below the chin of two subjects. While the subjects produced 25 phonemes in multiple phonetic contexts each, we measured the electromagnetic transmission spectra from one antenna to the other, and the reflection spectra for each antenna (radar), in a frequency band from 2–12 GHz. Two classification methods (k-nearest neighbors and linear discriminant analysis) were trained to predict the phoneme identity from the spectral data. With linear discriminant analysis, cross-validated phoneme recognition rates of 93% and 85% were achieved for the two subjects. Although these results are speaker- and session-dependent, they suggest that electromagnetic transmission and reflection measurements of the vocal tract have great potential for future silent-speech interfaces.
Published: 2018

4. Construction and Evaluation of a Parametric One-Dimensional Vocal Tract Model

Author: Simon Stone, Peter Birkholz, and Michael Marxen
Subjects: Basic medicine, Consonant, Acoustics and Ultrasonics, Computer science, Speech recognition, Speech synthesis, Function (mathematics), Natural sciences, Maximum error, Set (abstract data type), Medical and health sciences, Computational Mathematics, Developmental biology, Simple (abstract algebra), Physical sciences, Computer Science (miscellaneous), Electrical and Electronic Engineering, Acoustics, Vocal tract, Parametric statistics
Abstract: Articulatory speech synthesis based on aero-acoustic simulations of the vocal tract is computationally expensive and, therefore, requires simple yet precise models. Modeling the one-dimensional vocal tract area function directly instead of a higher dimensional vocal tract model is an efficient way to minimize the computational overhead of the simulations. In this paper, we propose a new parametric vocal tract model that is controlled by six points and capable of modeling a large variety of vocal tract shapes. We geometrically and perceptually evaluated the model on a set of 22 reference area functions corresponding to German vowels and consonants. The model was able to geometrically approximate the reference area functions with a minimum root-mean-square error of 0.302 cm$^2$, a maximum error of 1.142 cm$^2$, and a median error of 0.891 cm$^2$. After optimizations, a perceptual evaluation of the synthesis using our model in combination with a state-of-the-art aero-acoustic simulation achieved a vowel recognition rate of 90.7% and a consonant recognition rate of 73.2%.
Published: 2018
