How much does prosody help word segmentation? A simulation study on infant-directed speech

Authors :
Emmanuel Dupoux
Reiko Mazuka
Alejandrina Cristia
Bogdan Ludusan
RIKEN Center for Brain Science [Wako] (RIKEN CBS)
RIKEN - Institute of Physical and Chemical Research [Japon] (RIKEN)
Department of Psychology and Neuroscience
Duke University [Durham]
Laboratoire de sciences cognitives et psycholinguistique (LSCP)
Département d'Etudes Cognitives - ENS Paris (DEC)
École normale supérieure - Paris (ENS-PSL)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École normale supérieure - Paris (ENS-PSL)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS)
Apprentissage machine et développement cognitif (CoML)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS)-Département d'Etudes Cognitives - ENS Paris (DEC)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)
École des hautes études en sciences sociales (EHESS)
The research reported in this paper was partly funded by JSPS Grant-in-Aid for Scientific Research (16H06319, 20H05617) and MEXT Grant-in-aid on Innovative Areas #4903 (Co-creative Language Evolution), 17H06382 to RM. It was also supported by the European Research Council (ERC-2011-AdG-295810 BOOTPHON), the Agence Nationale pour la Recherche (ANR-17-CE28-0007 LangAge, ANR-16-DATA-0004 ACLEW, ANR-14-CE30-0003 MechELex, ANR-17-EURE-0017 Frontcog, ANR-10-IDEX-0001-02 PSL*, ANR-19-P3IA-0001 PRAIRIE 3IA Institute). ED is further grateful to the CIFAR (Learning in Machines and Brain), BL to the Canon Foundation in Europe, and AC to the JS McDonnell Foundation.
ANR-17-CE28-0007,LangAge,Différences dans l'apprenabilité du langage selon l'âge(2017)
ANR-16-DATA-0004,ACLEW,Analyzing Child Language Experiences Around the World(2016)
ANR-14-CE30-0003,MechELex,Méchanismes d'acquisition lexicale précoce(2014)
ANR-17-EURE-0017,FrontCog,Frontières en cognition(2017)
ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019)
European Project: 295810,EC:FP7:ERC,ERC-2011-ADG_20110406,BOOTPHON(2012)
École normale supérieure - Paris (ENS Paris)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École normale supérieure - Paris (ENS Paris)
Source :
Cognition, Elsevier, 2022, 219, pp.104961. ⟨10.1016/j.cognition.2021.104961⟩
Publication Year :
2022
Publisher :
Elsevier, 2022.

Abstract

Infants come to learn several hundred word forms by two years of age, and it is possible that this involves carving these forms out from continuous speech. It has been proposed that the task is facilitated by the presence of prosodic boundaries. We revisit this claim by running computational models of word segmentation, with and without prosodic information, on a corpus of infant-directed speech. We use five cognitively based algorithms, which vary in whether they employ a sub-lexical or a lexical segmentation strategy and whether they are simple heuristics or embody an ideal learner. Results show that providing expert-annotated prosodic breaks does not uniformly help all segmentation models. The sub-lexical algorithms, which perform more poorly, benefit most, while the lexical ones show a very small gain. Moreover, when prosodic information is derived automatically from the acoustic cues infants are known to be sensitive to, errors in the detection of the boundaries lead to smaller positive effects, and even negative ones for some algorithms. This shows that even though infants could potentially use prosodic breaks, it does not necessarily follow that they should incorporate prosody into their segmentation strategies when confronted with realistic signals.
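
For illustration only, the contrast between segmenting with and without prosodic breaks can be sketched with a simple sub-lexical heuristic. The sketch below is not one of the five algorithms used in the paper (the abstract does not name them); it assumes a forward transitional-probability rule over syllables, a hand-picked threshold, a four-utterance toy corpus, and prosodic breaks supplied as known boundary positions. The names `train_tp`, `segment`, `threshold`, and `breaks` are hypothetical.

```python
from collections import Counter


def train_tp(utterances):
    """Estimate forward transitional probabilities P(b | a) for adjacent syllables."""
    first_counts, pair_counts = Counter(), Counter()
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            first_counts[a] += 1       # times `a` is followed by some syllable
            pair_counts[(a, b)] += 1   # times `a` is followed by `b`
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(utt, tp, threshold, breaks=frozenset()):
    """Split a list of syllables into words.

    A word boundary is placed before syllable i when the transitional
    probability from syllable i-1 to syllable i is below `threshold`,
    or when i is listed in `breaks` (assumed prosodic break positions).
    """
    if not utt:
        return []
    words, current = [], [utt[0]]
    for i in range(1, len(utt)):
        low_tp = tp.get((utt[i - 1], utt[i]), 0.0) < threshold
        if low_tp or i in breaks:
            words.append("".join(current))
            current = []
        current.append(utt[i])
    words.append("".join(current))
    return words


# Toy syllabified corpus (invented for this sketch, not from the paper's data).
corpus = [
    ["pre", "tty", "ba", "by"],
    ["pre", "tty", "do", "ggy"],
    ["ba", "by", "do", "ggy"],
    ["do", "ggy", "ba", "by"],
]
tp = train_tp(corpus)

# Without prosody: the dip after "tty" is found, but "by" -> "do" is not.
no_prosody_a = segment(["pre", "tty", "ba", "by"], tp, threshold=0.75)
no_prosody_b = segment(["ba", "by", "do", "ggy"], tp, threshold=0.75)

# With an assumed prosodic break before syllable 2, the missed boundary is recovered.
with_prosody_b = segment(["ba", "by", "do", "ggy"], tp, threshold=0.75, breaks={2})
```

In this toy corpus, "by" is followed only by "do", so the statistical cue alone misses that boundary and a supplied break recovers it. If the supplied break positions were wrong, as they can be when breaks are detected automatically, the same mechanism would insert spurious boundaries.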

Details

Language :
English
ISSN :
0010-0277 and 1873-7838
Database :
OpenAIRE
Journal :
Cognition, Elsevier, 2022, 219, pp.104961. ⟨10.1016/j.cognition.2021.104961⟩
Accession number :
edsair.doi.dedup.....efb86f499c0bf816049c73868aefb6a9