Author: "Karyna Isaieva" / Topic: computer science - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Karyna Isaieva"' showing total 7 results

1. Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers

Author: Ioannis Douros, Yves Laprie, Pierre-André Vuissoz, Jacques Felblinger, Karyna Isaieva, and Justine Leclere
Affiliations: Imagerie Adaptative Diagnostique et Interventionnelle (IADI), Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL); Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Inria; Centre Hospitalier Universitaire de Reims (CHU Reims); Centre d'Investigation Clinique - Innovation Technologique [Nancy] (CIC-IT), Centre d'investigation clinique [Nancy] (CIC), Centre Hospitalier Régional Universitaire de Nancy (CHRU Nancy)-INSERM-Université de Lorraine (UL)
Subjects: Adult, Male, Statistics and Probability, Data Descriptor, Oral anatomy, Speech production, Computer science, Science, Speech recognition, Context (language use), Library and Information Sciences, 01 natural sciences, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], 030218 nuclear medicine & medical imaging, Education, Young Adult, 03 medical and health sciences, Imaging, Three-Dimensional, 0302 clinical medicine, Vocal tract images, 0103 physical sciences, Humans, Speech, Segmentation, Articulatory gestures, 010301 acoustics, Language, [SDV.IB] Life Sciences [q-bio]/Bioengineering, Communication, Middle Aged, Magnetic Resonance Imaging, Computer Science Applications, Metadata, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], Female, [SDV.IB]Life Sciences [q-bio]/Bioengineering, France, Statistics, Probability and Uncertainty, Mr images, Articulation (phonetics), Vocal tract, Information Systems
Abstract: The study of articulatory gestures has a wide spectrum of applications, notably in speech production and recognition. Sets of phonemes, as well as their articulation, are language-specific; however, existing MRI databases mostly include English speakers. In our present work, we introduce a dataset acquired with MRI from 10 healthy native French speakers. A corpus consisting of synthetic sentences was used to ensure good coverage of the French phonetic context. A real-time MRI technology with a temporal resolution of 20 ms was used to acquire vocal tract images of the participants speaking. The sound was recorded simultaneously with MRI, denoised, and temporally aligned with the images. The speech was transcribed to obtain phoneme-wise segmentation of sound. We also acquired static 3D MR images for a wide list of French phonemes. In addition, we include annotations of spontaneous swallowing.
Measurement(s): Vocal tract images • Speech. Technology Type(s): Magnetic Resonance Imaging • Microphone Device. Sample Characteristic - Organism: Homo sapiens. Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16404453
Published: 2021
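
The abstract above states a 20 ms temporal resolution and a phoneme-wise segmentation aligned with the images. As a rough illustration only, the following Python sketch maps a phoneme interval to the image frames that begin inside it. It assumes that frame k starts exactly at k × 20 ms on the audio time axis and that segment boundaries are given in whole milliseconds; the function name `frames_for_segment` and the example segments are hypothetical and do not come from the dataset.

```python
FRAME_PERIOD_MS = 20  # temporal resolution quoted in the abstract


def frames_for_segment(start_ms: int, end_ms: int,
                       frame_period_ms: int = FRAME_PERIOD_MS) -> range:
    """Return indices of frames whose start time lies in [start_ms, end_ms).

    Assumption (not from the source): frame k starts at k * frame_period_ms.
    Integer milliseconds are used to avoid floating-point rounding issues.
    """
    if end_ms < start_ms:
        raise ValueError("segment end precedes segment start")
    # Ceiling division: the first frame starting at or after each boundary.
    first = -(-start_ms // frame_period_ms)
    last = -(-end_ms // frame_period_ms)
    return range(first, last)


if __name__ == "__main__":
    # Hypothetical phoneme segmentation: (label, start in ms, end in ms).
    segments = [("b", 0, 100), ("o~", 100, 250), ("Z", 250, 330)]
    for label, start, end in segments:
        print(label, list(frames_for_segment(start, end)))
```

With a 20 ms period the sequence runs at 50 frames per second, so a 150 ms phoneme covers seven or eight frames depending on where its boundaries fall relative to the frame grid.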

2. Towards the prediction of the vocal tract shape from the sequence of phonemes to be articulated

Author: Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie, Justine Leclere, and Vinicius Ribeiro (also listed as Souza Ribeiro, Vinicius de Paulo)
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Inria; Imagerie Adaptative Diagnostique et Interventionnelle (IADI), Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL)
Subjects: Speech production, Generalization, Computer science, speech production, Speech recognition, Articulator, 020206 networking & telecommunications, 02 engineering and technology, [INFO] Computer Science [cs], neural networks, Euclidean distance, 030507 speech-language pathology & audiology, 03 medical and health sciences, Position (vector), Data efficiency, 0202 electrical engineering, electronic engineering, information engineering, [INFO]Computer Science [cs], 0305 other medical science, Set (psychology), phoneme-to-articulatory, Vocal tract
Abstract: In this work, we address the prediction of the time-varying geometric position of the speech articulators from the sequence of phonemes to be articulated. We start from a set of real-time MRI sequences uttered by a female French speaker. The contours of five articulators were tracked automatically in each frame of the MRI video. Then, we explore the capacity of a bidirectional GRU to correctly predict each articulator's shape and position given the sequence of phonemes and their durations. We propose a 5-fold cross-validation experiment to evaluate the generalization capacity of the model. In a second experiment, we evaluate our model's data efficiency by reducing the training data. We evaluate the point-to-point Euclidean distance and Pearson's correlation over time between the predicted and the target shapes. We also evaluate the produced shapes of the critical articulators for specific phonemes. We show that our model can achieve good results with minimal data, producing very realistic vocal tract shapes.
Published: 2021
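
The two evaluation measures named in the abstract above, point-to-point Euclidean distance and Pearson's correlation over time, can be sketched as follows. This is a minimal NumPy illustration, not the authors' evaluation code: the array layout (frames × contour points × 2 coordinates), the function names, and the choice to average distances over all points and frames are assumptions made for the example.

```python
import numpy as np


def mean_point_distance(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean Euclidean distance between corresponding contour points.

    Both arrays are assumed to have shape (frames, points, 2).
    """
    return float(np.linalg.norm(pred - target, axis=-1).mean())


def pearson_over_time(pred: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Pearson correlation of each coordinate trajectory across frames.

    Returns an array of shape (points, 2). A trajectory that is constant
    over time has zero variance and yields NaN here.
    """
    pred_c = pred - pred.mean(axis=0)
    target_c = target - target.mean(axis=0)
    num = (pred_c * target_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (target_c ** 2).sum(axis=0))
    return num / den


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(10, 4, 2))   # synthetic stand-in for contours
    pred = target + np.array([3.0, 4.0])   # constant offset of length 5
    print(mean_point_distance(pred, target))
    print(pearson_over_time(pred, target))
```

The synthetic example shows why the two measures complement each other: a prediction shifted by a constant offset has a nonzero distance but a perfect correlation, because the correlation only measures whether the trajectories move together.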

3. MRI Vocal Tract Sagittal Slices Estimation during Speech Production of CV

Author: Chrysanthi Dourou, Jacques Felblinger, Ajinkya Kulkarni, Pierre-André Vuissoz, Yu Xie, Yves Laprie, Karyna Isaieva, and Ioannis Douros
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Inria-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Department of Neurology, Wuhan University [China]; School of Electrical and Computer Engineering [Athens] (School of E.C.E), National Technical University of Athens [Athens] (NTUA); Centre d'investigation clinique [Nancy] (CIC), Centre Hospitalier Régional Universitaire de Nancy (CHRU Nancy)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL); Imagerie Adaptative Diagnostique et Interventionnelle (IADI), INSERM-Université de Lorraine (UL)
Subjects: Speech production, Signal processing, speech resources enrichment, Computer science, [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Speech recognition, 020206 networking & telecommunications, 02 engineering and technology, Sagittal plane, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], Set (abstract data type), RtMRI data, medicine.anatomical_structure, [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], vocal tract, 0202 electrical engineering, electronic engineering, information engineering, medicine, Frame (artificial intelligence), 020201 artificial intelligence & image processing, Image transformation, Vocal tract
Abstract: In this paper we propose an algorithm for estimating parasagittal slices of the vocal tract in order to give a better overview of the behaviour of the articulators during speech production. The first step is to align the consonant-vowel (CV) data across the sagittal planes for the training speaker. Sets of transformations that connect the midsagittal frames with the neighbouring ones are then computed for the training speaker. Another set of transformations, which maps the midsagittal frames of the training speaker to the corresponding midsagittal frames of the test speaker, is calculated and used to adapt the previously computed sets of transformations to the test speaker domain. The newly adapted transformations are applied to the midsagittal frames of the test speaker in order to estimate the neighbouring sagittal frames. Several single-speaker models are combined to produce the final frame estimation. To evaluate the results, image cross-correlation between the original and the estimated frames was used. Results show good agreement between the original and the estimated frames.
Published: 2021
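
The abstract above evaluates estimated frames by image cross-correlation but does not say which variant was used. The sketch below shows one common choice, the zero-lag normalized cross-correlation between two images of equal size, purely as an assumption for illustration; the function name is hypothetical and the toy array stands in for real MRI frames.

```python
import numpy as np


def normalized_cross_correlation(a, b) -> float:
    """Zero-lag normalized cross-correlation of two equally sized images.

    Values lie in [-1, 1]; 1 means the images agree up to a positive
    linear change of intensity (gain and offset).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    if denom == 0:
        raise ValueError("correlation is undefined for a constant image")
    return float((a0 * b0).sum() / denom)


if __name__ == "__main__":
    original = np.arange(12.0).reshape(3, 4)   # toy stand-in for a frame
    estimated = 2.0 * original + 5.0           # same pattern, rescaled
    print(normalized_cross_correlation(original, estimated))
```

Because the measure is invariant to gain and offset, it compares the spatial pattern of the two frames rather than their absolute intensities, which is useful when synthesized and acquired images differ in overall brightness.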

4. Automatic Tongue Delineation from MRI Images with a Convolutional Neural Network Approach

Author: Alexis Houssard, Yves Laprie, Jacques Felblinger, Pierre-André Vuissoz, Nicolas Turpault, and Karyna Isaieva
Affiliations: Imagerie Adaptative Diagnostique et Interventionnelle (IADI), Université de Lorraine (UL)-Institut National de la Santé et de la Recherche Médicale (INSERM); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Inria; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Centre d'Investigation Clinique - Innovation Technologique [Nancy] (CIC-IT), Centre d'investigation clinique [Nancy] (CIC), Université de Lorraine (UL)-Centre Hospitalier Régional Universitaire de Nancy (CHRU Nancy)-INSERM
Subjects: 0209 industrial biotechnology, [SDV.IB.IMA]Life Sciences [q-bio]/Bioengineering/Imaging, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Convolutional neural network, Task (project management), Mri image, 020901 industrial engineering & automation, Artificial Intelligence, Tongue, 0202 electrical engineering, electronic engineering, information engineering, medicine, Computer vision, [INFO]Computer Science [cs], ComputingMilieux_MISCELLANEOUS, ComputingMethodologies_COMPUTERGRAPHICS, medicine.diagnostic_test, business.industry, Magnetic resonance imaging, medicine.anatomical_structure, Computer Science::Computer Vision and Pattern Recognition, 020201 artificial intelligence & image processing, Artificial intelligence, business, [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
Abstract: Tongue contour extraction from real-time magnetic resonance images is a nontrivial task due to the presence of artifacts that manifest as blurring or ghost contours. In this work, we present results of automatic tongue delineation achieved by means of a U-Net auto-encoder convolutional neural network. We present both intra- and inter-subject validation. We used real-time magnetic resonance images and manually annotated 1-pixel-wide contours as inputs. Predicted probability maps were post-processed in order to obtain 1-pixel-wide tongue contours. The results are very good and slightly outperform published results on automatic tongue segmentation.
Published: 2020

5. Using Silence MR Image to Synthesise Dynamic MRI Vocal Tract Data of CV

Author: Chrysanthi Dourou, Pierre-André Vuissoz, Karyna Isaieva, Yu Xie, Jacques Felblinger, Ioannis Douros, Ajinkya Kulkarni, and Yves Laprie
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Inria-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Imagerie Adaptative Diagnostique et Interventionnelle (IADI), Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL); School of Electrical and Computer Engineering [Athens] (School of E.C.E), National Technical University of Athens [Athens] (NTUA); Department of Neurology, Wuhan University [China]; Centre d'investigation clinique [Nancy] (CIC), Centre Hospitalier Régional Universitaire de Nancy (CHRU Nancy)-INSERM-Université de Lorraine (UL)
Subjects: Computer science, [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Speech recognition, Frame (networking), 020206 networking & telecommunications, 02 engineering and technology, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], Image (mathematics), Set (abstract data type), Silence, 030507 speech-language pathology & audiology, 03 medical and health sciences, image transformation, [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], vocal tract, Dynamic contrast-enhanced MRI, 0202 electrical engineering, electronic engineering, information engineering, Mr images, 0305 other medical science, Speech resources enrichment, Vocal tract, rtMRI data, pseudo rtMRI synthesis
Abstract: In this work we present an algorithm for synthesising pseudo rtMRI data of the vocal tract. rtMRI data on the midsagittal plane were used to synthesise a target consonant-vowel (CV) sequence using only a silence frame of the target speaker. For this purpose, several single-speaker models were created. The input of the algorithm is a silence frame of both the training and the target speaker and the rtMRI data of the target CV. An image transformation is computed from each CV frame to the next one, creating a set of transformations that describe the dynamics of the CV production. Another image transformation is computed from the silence frame of the training speaker to the silence frame of the target speaker and is used to adapt the previously computed set of transformations to the target speaker. The adapted set of transformations is applied to the silence frame of the target speaker to synthesise his/her CV pseudo rtMRI data. Synthesised images from multiple single-speaker models are frame-aligned and then averaged to create the final version of the synthesised images. Synthesised images are compared with the original ones using image cross-correlation. Results show good agreement between the synthesised and the original images.
Published: 2020

6. Towards a Method of Dynamic Vocal Tract Shapes Generation by Combining Static 3D and Dynamic 2D MRI Speech Data

Author: Ioannis Douros, Anastasiia Tsukanova, Karyna Isaieva, Pierre-André Vuissoz, and Yves Laprie
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Inria-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Imagerie Adaptative Diagnostique et Interventionnelle (IADI), Université de Lorraine (UL)-Institut National de la Santé et de la Recherche Médicale (INSERM)
Subjects: speech resources enrichment, Computer science, Image quality, 02 engineering and technology, [INFO] Computer Science [cs], Set (abstract data type), 030507 speech-language pathology & audiology, 03 medical and health sciences, Dimension (vector space), vocal tract, 0202 electrical engineering, electronic engineering, information engineering, [INFO]Computer Science [cs], Computer vision, Spatial analysis, MRI data, business.industry, Frame (networking), 020206 networking & telecommunications, image transformation, Transformation (function), [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV], [INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV], modality transformation, Artificial intelligence, 0305 other medical science, business, Vocal tract
Abstract: We present an algorithm for augmenting the shape of the vocal tract using 3D static and 2D dynamic speech MRI data. While static 3D images have better resolution and provide spatial information, 2D dynamic images capture the transitions. The aim of this work is to combine the strong points of these two types of data to obtain better image quality for the 2D dynamic images and to extend the 2D dynamic images to the 3D domain. To produce a 3D dynamic consonant-vowel (CV) sequence, our algorithm takes as input the 2D CV transition and the static 3D targets for C and V. To obtain the enhanced sequence of images, the first step is to find a transformation between the 2D images and the mid-sagittal slice of the acoustically corresponding 3D image stack, and then to find a transformation between neighbouring sagittal slices in the 3D static image stack. Combining these transformations produces the final set of images. In the present study we first examined the transformation from the 3D mid-sagittal frame to the 2D video in order to improve image quality, and then we examined the extension of the 2D video to the third dimension with the aim of enriching spatial information.
Published: 2019

7. A Multimodal Real-Time MRI Articulatory Corpus of French for Speech Research

Author: Arun A. Joseph, Jens Frahm, Dirk Voit, Yves Laprie, Karyna Isaieva, Freddy Odille, Ioannis Douros, Anastasiia Tsukanova, Jacques Felblinger, and Pierre-André Vuissoz
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Inria-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Imagerie Adaptative Diagnostique et Interventionnelle (IADI), Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL); Biomedizinische NMR Forschungs GmbH [Göttingen], Max-Planck-Institut für Biophysikalische Chemie - Max Planck Institute for Biophysical Chemistry [Göttingen], Max-Planck-Gesellschaft
Subjects: Larynx, [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], Speech production, Epiglottis, Computer science, speech production, Speech recognition, 02 engineering and technology, speech synthesis, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], 030507 speech-language pathology & audiology, 03 medical and health sciences, Tongue, 0202 electrical engineering, electronic engineering, information engineering, medicine, speech corpus, Spontaneous speech, multi-modal database, 020206 networking & telecommunications, Real-time MRI, medicine.anatomical_structure, Duration (music), French language, real-time MRI data, 0305 other medical science, Vocal tract, 3D MRI data
Abstract: In this work we describe the creation of ArtSpeechMRIfr: a real-time as well as static magnetic resonance imaging (rtMRI, 3D MRI) database of the vocal tract. The database also contains processed data: denoised audio, its phonetically aligned annotation, articulatory contours, and vocal tract volume information, which provides a rich resource for speech research. The database is built on data from two male speakers of French. It covers a number of phonetic contexts in the controlled part, as well as spontaneous speech, 3D MRI scans of sustained vocalic articulations, and 3D MRI scans of the dental casts of the subjects. The corpus for rtMRI consists of 79 synthetic sentences constructed from a phonetized dictionary, which makes it possible to shorten the duration of acquisitions while keeping very good coverage of the phonetic contexts that exist in French. The 3D MRI includes acquisitions for 12 French vowels and 10 consonants, each of which was pronounced in several vocalic contexts. Articulatory contours (tongue, jaw, epiglottis, larynx, velum, lips) as well as 3D volumes were manually drawn for a part of the images.
Published: 2019
