The Time Course of Audio-Visual Phoneme Identification: A High Temporal Resolution Study
- Author
- Sonia Kandel, Salvador Soto-Faraco, Christophe Savariaux, Carolina Sánchez-García; Universitat Pompeu Fabra [Barcelona] (UPF); GIPSA - Voix Systèmes Linguistiques et Dialectologie (GIPSA-VSLD); Département Parole et Cognition (GIPSA-DPC); Grenoble Images Parole Signal Automatique (GIPSA-lab), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Centre National de la Recherche Scientifique (CNRS), Université Grenoble Alpes (UGA); GIPSA-Services; Institució Catalana de Recerca i Estudis Avançats (ICREA)
- Subjects
Speech perception, Speech recognition, Speech processing, Consonant, Multisensory integration, Audio-visual, Gating, Perception, Cognitive Neuroscience, Experimental and Cognitive Psychology, Sensory Systems, Ophthalmology, Computer Vision and Pattern Recognition, Psychology, [SCCO]Cognitive science, [SCCO.LING]Cognitive science/Linguistics, [SCCO.PSYC]Cognitive science/Psychology
- Abstract
Speech unfolds in time and, as a consequence, its perception requires temporal integration. Yet, studies addressing audio-visual speech processing have often overlooked this temporal aspect. Here, we address the temporal course of audio-visual speech processing in a phoneme identification task using a Gating paradigm. We created disyllabic Spanish word-like utterances (e.g., /pafa/, /paθa/, …) from high-speed camera recordings. The stimuli differed only in the middle consonant (/f/, /θ/, /s/, /r/, /g/), which varied in visual and auditory saliency. As in classical Gating tasks, the utterances were presented in fragments of increasing length (gates), here in 10 ms steps, for identification and confidence ratings. We measured correct identification as a function of time (at each gate) for each critical consonant in the audio, visual and audio-visual conditions, and computed Identification Point and Recognition Point scores. The results revealed that audio-visual identification is a time-varying process that depends on the relative strength (i.e., saliency) of each modality. In some cases, audio-visual identification followed the pattern of one dominant modality (either A or V) when that modality was very salient. In other cases, both modalities contributed to identification, resulting in an audio-visual advantage or interference relative to the unimodal conditions. Both unimodal dominance and audio-visual interaction patterns may arise during the identification of the same utterance, at different times. These findings suggest that models of audio-visual speech integration should take into account the time-varying nature of visual and auditory saliency.
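For readers unfamiliar with gating scores, the sketch below illustrates one conventional way an Identification Point could be derived from gate-by-gate responses (the earliest gate from which the listener's answer is correct and remains correct at all later gates). The scoring criterion, function name, and example data are assumptions for illustration only and are not taken from the paper itself.

```python
# Minimal sketch of an Identification Point (IP) computation for a Gating task.
# Assumption: IP = first gate at which the response is correct and stays
# correct for all subsequent gates; this may differ from the authors' exact
# scoring procedure.

def identification_point(responses, target, gate_step_ms=10):
    """Return the IP in ms, or None if the target is never stably identified.

    responses    : list of phoneme labels given at successive gates
    target       : the correct phoneme label (e.g., "f")
    gate_step_ms : duration added at each gate (10 ms in this study)
    """
    ip_gate = None
    for gate, response in enumerate(responses, start=1):
        if response == target:
            if ip_gate is None:
                ip_gate = gate      # candidate IP: first correct response
        else:
            ip_gate = None          # a later error resets the candidate
    return ip_gate * gate_step_ms if ip_gate is not None else None


# Hypothetical example: the listener settles on /f/ from the 5th gate onward.
print(identification_point(["s", "s", "f", "s", "f", "f", "f"], "f"))  # -> 50
```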
- Published
- 2018