1. Statistical learning models of early phonetic acquisition struggle with child-centered audio data
- Author
-
Marvin Lavechin, Maureen de Seyssel, Marianne Métais, Florian Metz, Abdelrahman Mohame, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia, Laboratoire de sciences cognitives et psycholinguistique (LSCP), Département d'Etudes Cognitives - ENS Paris (DEC), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS), Apprentissage machine et développement cognitif (CoML), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS)-Département d'Etudes Cognitives - ENS Paris (DEC), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Meta AI Research [Paris], Meta AI, Laboratoire de Linguistique Formelle (LLF - UMR7110), Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité), Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI), Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT), Centre National de la Recherche Scientifique (CNRS), CIFAR (Learning in Machines and Brains), ANR-16-DATA-0004,ACLEW,Analyzing Child Language Experiences Around the World(2016), ANR-17-EURE-0017,FrontCog,Frontières en cognition(2017), ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), and European Project: 101001095,ExELang
- Subjects
Perceptual attunement ,Domain-general ,Phonetic learning ,Language acquisition ,Domain-specific ,Child-centered long-forms ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] - Abstract
International audience; Infants learn their native language(s) at an amazing speed. Before they even talk, their perception adapts to the language(s) they hear. However, the mechanisms responsible for this perceptual attunement still remain unclear. A long tradition in linguistics points to the importance of specialized language mechanisms that would allow us to quickly and effortlessly learn from the language(s) we are exposed to. However, the currently dominant explanation for perceptual attunement posits that infants apply a domain-general learning mechanism consisting in learning statistical regularities from the speech stream they hear, and which may be found in learning across domains and across species. Critically, the feasibil- ity of employing purely domain-general statistical learning mechanisms has only been demonstrated with computational models on unrealistic and simplified input. This paper presents the first attempt to study perceptual attunement with 2,000 hours of ecological child-centered recordings in American English and Metropolitan French. We show that, when applied on ecologically-valid data, generic learning mecha- nisms develop a language-relevant perceptual space but fail to show evidence for perceptual attunement. It is only when supplemented with domain-specific audio filtering and augmentation mechanisms that computational models show a significant attunement to the language they have been exposed to. Hence, we conclude that, when learning from ecological audio, domain-specific mechanisms may be necessary to guide early language learning in the wild even if the learning itself is done through generic mechanisms. We anticipate our work to be a starting point for ecologically-valid computational models of perceptual attunement in other domains and species.
- Published
- 2023