Sophie Herment, Anne Tortel, Brigitte Bigi, Daniel Hirst, Anastassia Loukina, Laboratoire Parole et Langage (LPL), Aix Marseille Université (AMU)-Centre National de la Recherche Scientifique (CNRS), BCL, équipe Diachronie, Dialectologie, et Phonologie (DDP) [2008..2015], Bases, Corpus, Langage (UMR 7320 - UNS / CNRS; UCA / CNRS) (BCL), Université Nice Sophia Antipolis (1965 - 2019) (UNS)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Université Côte d'Azur (UCA)-Centre National de la Recherche Scientifique (CNRS), Ana Díaz-Negrillo, Francisco Javier Díaz-Pérez (eds.), ANR-16-CONV-0002, ILCB, Institute of Language Communication and the Brain
This paper presents AixOx, a multilingual learner corpus collected in the framework of an Alliance project (a partnership between the British Council and the French Ministry of Foreign Affairs). The corpus consists of recordings of 40 one-minute passages in English and French from the Eurom 1 corpus (Chan et al., 1995), read by native speakers and L2 learners. French native speakers reading the French and English passages were recorded in Aix-en-Provence, and English native speakers reading the English and French passages were recorded in Oxford. The AixOx corpus contains about 40 hours of read speech and can be downloaded from the “Speech and Language Data Repository” (http://sldr.org).

This paper also presents the tools used to annotate the corpus automatically on several layers:
• SPPAS (SPeech Phonetization Alignment and Syllabification; Bigi, 2012) for segmentation into utterances, words, syllables and phonemes;
• MoMel (Modelling Melody) and INTSINT (INternational Transcription System for INTonation) (Hirst, 2007) for the modelling and coding of intonation.

Finally, an example of a pedagogical application of the corpus is given: a pilot study on the intonation of questions. We show how the AixOx corpus can be used to compare the productions of native speakers with those of learners, and how the annotation makes it possible to understand and explain the prosodic realisations, whether positive or negative. We conclude that AixOx, with its multi-layered annotation, is a very rich oral database for all kinds of studies on L1 productions, L2 productions and language contact, at both the segmental and supra-segmental levels, since it offers phonemic segmentation and alignment as well as prosodic labelling.
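As a rough illustration of the kind of native/learner comparison described for the pilot study on questions, the following minimal sketch classifies the final pitch movement of an utterance from a sequence of pitch target points. It is not the authors' procedure and does not use the SPPAS, MoMel or INTSINT software: the function names (`semitones`, `final_movement`), the 1-semitone threshold and the target values are all hypothetical, chosen only to show the idea.

```python
import math


def semitones(f_from: float, f_to: float) -> float:
    """Interval from f_from to f_to in semitones (positive means a rise)."""
    return 12.0 * math.log2(f_to / f_from)


def final_movement(targets, threshold_st=1.0):
    """Classify the movement between the last two pitch targets.

    targets: list of (time_s, f0_hz) pairs sorted by time, standing in for
    the kind of target points a melody-modelling step could provide.
    threshold_st: hypothetical threshold below which the contour counts
    as level.
    """
    if len(targets) < 2:
        raise ValueError("need at least two pitch targets")
    (_, f_prev), (_, f_last) = targets[-2], targets[-1]
    interval = semitones(f_prev, f_last)
    if interval > threshold_st:
        return "rise", interval
    if interval < -threshold_st:
        return "fall", interval
    return "level", interval


# Invented values for illustration only; these are NOT taken from AixOx.
native = [(0.10, 200.0), (0.55, 180.0), (1.20, 160.0), (1.45, 240.0)]
learner = [(0.12, 190.0), (0.60, 185.0), (1.30, 170.0), (1.60, 150.0)]

for speaker, targets in (("native", native), ("learner", learner)):
    label, interval = final_movement(targets)
    print(f"{speaker}: final {label} ({interval:+.1f} st)")
```

Expressing the interval in semitones rather than hertz makes the comparison less dependent on each speaker's pitch range, which matters when productions from different speakers are set side by side.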