223 results on '"Favre, Benoit"'
Search Results
102. Detecting person presence in TV shows with linguistic and structural features
- Author
-
Bechet, Frederic, primary, Favre, Benoit, additional, and Damnati, Geraldine, additional
- Published
- 2012
103. Applying Multiclass Bandit algorithms to call-type classification
- Author
-
Ralaivola, Liva, primary, Favre, Benoit, additional, Gotab, Pierre, additional, Bechet, Frederic, additional, and Damnati, Geraldine, additional
- Published
- 2011
104. Semi-supervised part-of-speech tagging in speech applications
- Author
-
Dufour, Richard, primary and Favre, Benoit, additional
- Published
- 2010
105. The CALO Meeting Assistant System
- Author
-
Tur, Gokhan, primary, Stolcke, Andreas, additional, Voss, Lynn, additional, Peters, Stanley, additional, Hakkani-Tur, Dilek, additional, Dowding, John, additional, Favre, Benoit, additional, Fernandez, Raquel, additional, Frampton, Matthew, additional, Frandsen, Mike, additional, Frederickson, Clint, additional, Graciarena, Martin, additional, Kintzing, Donald, additional, Leveque, Kyle, additional, Mason, Shane, additional, Niekrasz, John, additional, Purver, Matthew, additional, Riedhammer, Korbinian, additional, Shriberg, Elizabeth, additional, Tien, Jing, additional, Vergyri, Dimitra, additional, and Yang, Fan, additional
- Published
- 2010
106. Evaluation of semantic role labeling and dependency parsing of automatic speech recognition output
- Author
-
Favre, Benoit, primary, Bohnet, Bernd, additional, and Hakkani-Tur, Dilek, additional
- Published
- 2010
107. Integrating prosodic features in extractive meeting summarization
- Author
-
Xie, Shasha, primary, Hakkani-Tur, Dilek, additional, Favre, Benoit, additional, and Liu, Yang, additional
- Published
- 2009
108. Any questions? Automatic question detection in meetings
- Author
-
Boakye, Kofi, primary, Favre, Benoit, additional, and Hakkani-Tur, Dilek, additional
- Published
- 2009
109. Phrase and word level strategies for detecting appositions in speech
- Author
-
Favre, Benoit, primary and Hakkani-Tür, Dilek, additional
- Published
- 2009
110. Leveraging sentence weights in a concept-based optimization framework for extractive meeting summarization
- Author
-
Xie, Shasha, primary, Favre, Benoit, additional, Hakkani-Tür, Dilek, additional, and Liu, Yang, additional
- Published
- 2009
111. Clusterrank: a graph based method for meeting summarization
- Author
-
Garg, Nikhil, primary, Favre, Benoit, additional, Riedhammer, Korbinian, additional, and Hakkani-Tür, Dilek, additional
- Published
- 2009
112. Combined low level and high level features for out-of-vocabulary word detection
- Author
-
Lecouteux, Benjamin, primary, Linarès, Georges, additional, and Favre, Benoit, additional
- Published
- 2009
113. Generative and Discriminative Methods Using Morphological Information for Sentence Segmentation of Turkish
- Author
-
Guz, Umit, primary, Favre, Benoit, additional, Hakkani-Tur, Dilek, additional, and Tur, Gokhan, additional
- Published
- 2009
114. Syntactically-informed models for comma prediction
- Author
-
Favre, Benoit, primary, Hakkani-Tur, Dilek, additional, and Shriberg, Elizabeth, additional
- Published
- 2009
115. A global optimization framework for meeting summarization
- Author
-
Gillick, Dan, primary, Riedhammer, Korbinian, additional, Favre, Benoit, additional, and Hakkani-Tur, Dilek, additional
- Published
- 2009
116. ICSI-CRF
- Author
-
Favre, Benoit, primary and Bohnet, Bernd, additional
- Published
- 2009
117. A scalable global model for summarization
- Author
-
Gillick, Dan, primary and Favre, Benoit, additional
- Published
- 2009
118. A keyphrase based approach to interactive meeting summarization
- Author
-
Riedhammer, Korbinian, primary, Favre, Benoit, additional, and Hakkani-Tur, Dilek, additional
- Published
- 2008
119. Efficient sentence segmentation using syntactic features
- Author
-
Favre, Benoit, primary, Hakkani-Tur, Dilek, additional, Petrov, Slav, additional, and Klein, Dan, additional
- Published
- 2008
120. Packing the meeting summarization knapsack
- Author
-
Riedhammer, Korbinian, primary, Gillick, Dan, additional, Favre, Benoit, additional, and Hakkani-Tür, Dilek, additional
- Published
- 2008
121. Speech segmentation and spoken document processing
- Author
-
Ostendorf, Mari, primary, Favre, Benoit, additional, Grishman, Ralph, additional, Hakkani-Tur, Dilek, additional, Harper, Mary, additional, Hillard, Dustin, additional, Hirschberg, Julia, additional, Ji, Heng, additional, Kahn, Jeremy G., additional, Liu, Yang, additional, Maskey, Sameer, additional, Matusov, Evgeny, additional, Ney, Hermann, additional, Rosenberg, Andrew, additional, Shriberg, Elizabeth, additional, Wang, Wen, additional, and Wooters, Chuck, additional
- Published
- 2008
122. Cross-Genre Feature Comparisons for Spoken Sentence Segmentation
- Author
-
Cuendet, Sebastien, primary, Hakkani-Tur, Dilek, additional, Shriberg, Elizabeth, additional, Fung, James, additional, and Favre, Benoit, additional
- Published
- 2007
123. An interactive timeline for speech database browsing
- Author
-
Favre, Benoit, primary, Bonastre, Jean-François, additional, and Bellot, Patrice, additional
- Published
- 2007
124. Cross-Genre Feature Comparisons for Spoken Sentence Segmentation
- Author
-
Cuendet, Sebastien, Hakkani-Tur, Dilek, Shriberg, Elizabeth, Fung, James, and Favre, Benoit
- Published
- 2007
125. The LIA summarization system at DUC-2007
- Author
-
Favre, Benoit, Gillard, Laurent, and Torres-Moreno, Juan-Manuel
126. Contextual language understanding Thoughts on Machine Learning in Natural Language Processing
- Author
-
Benoit Favre, Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université, and Thierry Artières
- Subjects
[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; machine learning; multimodal systems; natural language processing
- Abstract
This document is a habilitation à diriger des recherches (HDR) thesis. It is organized in two parts: the first part presents a reflection on my work and the state of the Natural Language Processing community; the second part is an overview of my activity, including a detailed CV, a summary of the work of the PhD students I contributed to advising, and a list of my personal publications. Self-citations are marked with a postfix and listed in Chapter 9, while external references are listed in the bibliography at the end of the document. Each contribution chapter ends with a section listing PhD student work related to that chapter.
- Published
- 2019
127. Apprentissage d'agents conversationnels pour la gestion de relations clients
- Author
-
Benoit Favre, Frédéric Béchet, Géraldine Damnati, Delphine Charlet, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), France Télécom Recherche et Développement [Lannion] (FTR&D), France Télécom, Orange Labs [Lannion], ANR-15-CE23-0003 DATCHA: Extraction de connaissances à partir de vastes corpus de conversations 'chat' client-opérateurs (2015), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-TT] Computer Science [cs]/Document and Text Processing; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; GRU; ACM: I.: Computing Methodologies/I.2: ARTIFICIAL INTELLIGENCE/I.2.7: Natural Language Processing; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; seq2seq; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Chatbots; LSTM
- Abstract
English title: Training chatbots for customer relation management. This work demonstrates the feasibility of training chatbots on customer relation conversation traces. Systems based on language models, information retrieval and machine translation are compared on the task.
- Published
- 2017
128. Fusion d'espaces de représentations multimodaux pour la reconnaissance du rôle du locuteur dans des documents télévisuels
- Author
-
Sebastien Delecraz, Frédéric Béchet, Benoit Favre, Mickael Rouvier, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO] Computer Science [cs]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Speaker role recognition; Speaker role identification; Television broadcasts; Broadcast news; Multimodal speaker embeddings; Multimodal fusion
- Abstract
English title: Multimodal embedding fusion for robust speaker role recognition in video broadcast. Person role recognition in video broadcasts consists of classifying people into roles such as anchor, journalist, guest, etc. Existing approaches mostly consider one modality, either audio (speaker role recognition) or image (shot role recognition), firstly because of the non-synchrony between the two modalities, and secondly because of the lack of a video corpus annotated in both modalities. Deep Neural Network (DNN) approaches offer the ability to simultaneously learn feature representations (embeddings) and classification functions. This paper presents a multimodal fusion of audio, text and image embedding spaces for speaker role recognition in asynchronous data. Monomodal embeddings are trained on exogenous data and fine-tuned using a DNN on a 70-hour corpus of French broadcasts for the target task. Experiments on the REPERE corpus show the benefit of embedding-level fusion compared to the monomodal embedding systems and to the standard late fusion method.
- Published
- 2016
129. Joint syntactic and semantic analysis with a multitask Deep Learning Framework for Spoken Language Understanding
- Author
-
Benoit Favre, Frédéric Béchet, Thierry Artières, Jeremie Tafforeau, Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), éQuipe AppRentissage et MultimediA [Marseille] (QARMA), and ISCA
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; Computer science; Speech recognition; Semantic analysis (machine learning); 02 engineering and technology; [INFO] Computer Science [cs]; computer.software_genre; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; 030507 speech-language pathology & audiology; 03 medical and health sciences; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; 0202 electrical engineering, electronic engineering, information engineering; ComputingMilieux_MISCELLANEOUS; business.industry; Deep learning; 020201 artificial intelligence & image processing; Artificial intelligence; 0305 other medical science; Joint (audio engineering); business; computer; Natural language processing; Spoken language
- Abstract
No abstract available.
- Published
- 2016
130. Speech onset latencies as an online measure of regularity extraction
- Author
-
Louisa Bogaerts, Ana Franco, Benoit Favre, Arnaud Rey, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Laboratoire de psychologie cognitive (LPC), Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; [INFO] Computer Science [cs]; ComputingMilieux_MISCELLANEOUS
- Abstract
No abstract available.
- Published
- 2016
131. Beyond utterance extraction: summary recombination for speech summarization
- Author
-
Frédéric Béchet, Benoit Favre, Jérémy Trione, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), and Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; business.industry; Computer science; Speech recognition; 02 engineering and technology; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO] Computer Science [cs]; 16. Peace & justice; computer.software_genre; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Speech summarization; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing; Utterance; ComputingMilieux_MISCELLANEOUS
- Abstract
No abstract available.
- Published
- 2016
132. Détection de concepts pertinents pour le résumé automatique de conversations par recombinaison de patrons
- Author
-
Jérémy Trione, Benoit Favre, Frédéric Béchet, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), and Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; [INFO] Computer Science [cs]; ComputingMilieux_MISCELLANEOUS
- Abstract
No abstract available.
- Published
- 2016
133. A Document Repository for Social Media and Speech Conversations
- Author
-
Adam Funk, R. Gaizauskas, Benoit Favre, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), and University of Sheffield [Sheffield]
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Document repository; social media; REST service; [INFO] Computer Science [cs]; ComputingMilieux_MISCELLANEOUS
- Abstract
We present a successfully implemented document repository REST service for flexible SCRUD (search, create, read, update, delete) storage of social media and speech conversations, using a GATE/TIPSTER-like document object model and providing a query language for document features. This software is currently being used in the SENSEI research project and will be published as open-source software before the project ends. It is, to the best of our knowledge, the first freely available, general purpose data repository to support large-scale multimodal (i.e., speech or text) conversation analytics.
- Published
- 2016
134. 'speech is silver, but silence is golden': improving speech-to-speech translation performance by slashing users input
- Author
-
Benoit Favre, Frédéric Béchet, Mickael Rouvier, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Matching (statistics); Vocabulary; Machine translation; Computer science; media_common.quotation_subject; Speech recognition; 02 engineering and technology; computer.software_genre; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Task (project management); 030507 speech-language pathology & audiology; 03 medical and health sciences; 0202 electrical engineering, electronic engineering, information engineering; Quality (business); Dialog box; Dialog system; ComputingMilieux_MISCELLANEOUS; media_common; business.industry; 020206 networking & telecommunications; Usability; 13. Climate action; 0305 other medical science; business; computer
- Abstract
Speech-to-speech translation is a challenging task mixing two of the most ambitious Natural Language Processing challenges: Machine Translation (MT) and Automatic Speech Recognition (ASR). Recent advances in both fields have led to operational systems that achieve good performance when used in conditions matching those under which the ASR and MT models were trained. Regardless of the quality of these models, errors are inevitable due to some technical limitations of the systems (e.g. closed vocabulary) and intrinsic ambiguities of spoken languages. However, not all ASR and MT errors have the same impact on the usability of a given speech-to-speech dialog system: some can be very benign and unconsciously corrected by users, while others can damage the understanding between users and eventually lead the dialog to a failure. We present in this paper a strategy focusing on ASR error segments that have a high negative impact on MT performance. We propose a method that first automatically detects these erroneous segments and then estimates their impact on MT. We show that removing such segments prior to translation can lead to a significant decrease in translation error rate, even without any correction strategy.
- Published
- 2015
135. Adapting lexical representation and OOV handling from written to spoken language with word embedding
- Author
-
Frédéric Béchet, Thierry Artières, Jeremie Tafforeau, Benoit Favre, Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Université Pierre et Marie Curie - Paris 6 (UPMC), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Space (punctuation); Conditional random field; Word embedding; Computer science; Generalization; 02 engineering and technology; computer.software_genre; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; 030507 speech-language pathology & audiology; 03 medical and health sciences; Transcription (linguistics); 0202 electrical engineering, electronic engineering, information engineering; Adaptation (computer science); ComputingMilieux_MISCELLANEOUS; business.industry; 020206 networking & telecommunications; ComputingMethodologies_PATTERNRECOGNITION; Embedding; Artificial intelligence; 0305 other medical science; business; computer; Word (computer architecture); Natural language processing; Spoken language
- Abstract
Word embeddings have become ubiquitous in NLP, especially when using neural networks. One of the assumptions of such representations is that words with similar properties have similar representations, allowing for better generalization from subsequent models. In the standard setting, two kinds of training corpora are used: a very large unlabeled corpus for learning the word embedding representations, and an in-domain training corpus with gold labels for training classifiers on the target NLP task. Because of the amount of data required to learn embeddings, they are trained on large corpora of written text. This can be an issue when dealing with non-canonical language, such as spontaneous speech: embeddings have to be adapted to fit the particularities of spoken transcriptions. However, the adaptation corpus available for a given speech application can be limited, resulting in a high number of words from the embedding space not occurring in the adaptation space. We present in this paper a method for adapting an embedding space trained on written text to a spoken corpus of limited size. In particular, we deal with words from the embedding space not occurring in the adaptation data. We report experiments done on a Part-of-Speech task on spontaneous speech transcriptions collected in a call-centre. We show that our word embedding adaptation approach outperforms a state-of-the-art Conditional Random Field approach when little in-domain adaptation data is available.
- Published
- 2015
136. Détection et caractérisation d’erreurs dans des transcriptions automatiques pour des systèmes de traduction parole-parole
- Author
-
Frédéric Béchet, Benoit Favre, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; ComputingMilieux_MISCELLANEOUS
- Abstract
No abstract available.
- Published
- 2014
137. Correction interactive de transcriptions de parole par fusion de phrases
- Author
-
Mickael Rouvier, Benoit Favre, Frédéric Béchet, Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), and Favre, Benoit
- Subjects
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,ComputingMilieux_MISCELLANEOUS ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] - Abstract
International audience; no abstract
- Published
- 2014
138. Automatically enriching spoken corpora with syntactic information for linguistic studies
- Author
-
Alexis Nasr, Frédéric Béchet, Benoit Favre, Thierry Bazillon, José Deulofeu, André Valli, Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Informatique de l'Université du Maine (LIUM), Le Mans Université (UM)-Centre National de la Recherche Scientifique (CNRS), DEscription Linguistique Informatisée sur Corpus (DELIC), Université de Provence - Aix-Marseille 1, Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), and Favre, Benoit
- Subjects
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,ComputingMilieux_MISCELLANEOUS ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] - Abstract
International audience; no abstract
- Published
- 2014
139. Scene understanding for identifying persons in TV shows: beyond face authentication
- Author
-
Géraldine Damnati, Delphine Charlet, Meriem Bendris, Benoit Favre, Mickael Rouvier, Favre, Benoit, Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), France Télécom Recherche & Développement (FT R&D), France Télécom, France Télécom Recherche et Développement [Lannion] (FTR&D), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), and Centre d'Enseignement et de Recherche en Informatique - CERI-Avignon Université (AU)
- Subjects
Focus (computing) ,Authentication ,Exploit ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,Identification (information) ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,Face (geometry) ,Three-dimensional face recognition ,Computer vision ,Artificial intelligence ,Face detection ,business ,Baseline (configuration management) ,ComputingMilieux_MISCELLANEOUS - Abstract
Our goal is to automatically identify people in TV news and debates without any predefined dictionary of people. In this paper, we focus on the problem of person identification beyond face authentication in order to improve the identification results and not only where the face is detectable. We propose to use automatic scene analysis as features for people identification. We exploit two features: scene classification (studio and report) and camera identification. Then, people are identified by propagation strategies of overlaid names (OCR results) and speakers to scene classes and specific camera shots. Experiments performed on the REPERE corpus show improvement of face identification using scene understanding features (+13.9% of F-measure compared to the baseline).
- Published
- 2014
140. Joint Decoding of Complementary Utterances
- Author
-
Benoit Favre, Mickael Rouvier, Frédéric Béchet, Favre, Benoit, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Computer science ,business.industry ,Speech recognition ,Probabilistic logic ,020206 networking & telecommunications ,0102 computer and information sciences ,02 engineering and technology ,Translation (geometry) ,computer.software_genre ,01 natural sciences ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,Task (project management) ,010201 computation theory & mathematics ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,0202 electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,Joint (audio engineering) ,business ,computer ,Decoding methods ,Word (computer architecture) ,Natural language processing ,Utterance ,ComputingMilieux_MISCELLANEOUS - Abstract
Errors in open-domain ASR can be corrected by asking the speaker to rephrase targeted segments in utterances where they have been detected. The utterance merging problem consists in generating a better transcript from the utterance where errors have been detected and a clarification utterance. We introduce an alignment-decoding algorithm that jointly processes the two utterances and benefits from the complementary information they contain. The algorithm aligns word lattices in the WFST framework with a probabilistic cost model. Results on the BOLT-BC speech-to-speech translation task show an improvement of 2.84 points of accuracy compared to aligning the one-best hypotheses without joint decoding.
- Published
- 2014
141. Retrieving the syntactic structure of erroneous ASR transcriptions for open-domain Spoken Language Understanding
- Author
-
Benoit Favre, Mathieu Morey, Alexis Nasr, Frédéric Béchet, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Semantic Analysis of Natural Language (SEMAGRAMME), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS), Favre, Benoit, Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), and Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)
- Subjects
Structure (mathematical logic) ,Parsing ,Point (typography) ,Computer science ,Process (engineering) ,business.industry ,Speech recognition ,Word error rate ,Automatic Speech Recognition ,computer.software_genre ,Syntax ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,Transcription (linguistics) ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,Confidence Measures ,Dependency grammar ,Dependency Parsing ,Spoken Language Understanding ,Syntactic structure ,Artificial intelligence ,business ,computer ,Natural language processing ,Spoken language - Abstract
International audience; Retrieving the syntactic structure of erroneous ASR transcriptions can be of great interest for open-domain Spoken Language Understanding tasks in order to correct or at least reduce the impact of ASR errors on final applications. Most previous work on ASR and syntactic parsing has addressed this problem by using syntactic features during ASR to help reduce Word Error Rate (WER). The improvement obtained is rather small; however, the structure and the relations between words obtained through parsing can be of great interest for SLU processes, even without a significant decrease in WER. That is why we adopt another point of view in this paper: considering that ASR transcriptions inevitably contain some errors, we show in this study that it is possible to improve the syntactic analysis of these erroneous transcriptions by performing a joint error detection / syntactic parsing process. The applicative framework used in this study is a speech-to-speech system developed through the DARPA BOLT project.
- Published
- 2014
142. Speaker adaptation of DNN-based ASR with i-vectors: Does it actually adapt models to speakers?
- Author
-
Mickael Rouvier, Benoit Favre, Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Laboratoire d'informatique Fondamentale de Marseille (LIF), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Favre, Benoit, Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), and Centre d'Enseignement et de Recherche en Informatique - CERI-Avignon Université (AU)
- Subjects
Computer science ,Speech recognition ,020206 networking & telecommunications ,02 engineering and technology ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,030507 speech-language pathology & audiology ,03 medical and health sciences ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,0202 electrical engineering, electronic engineering, information engineering ,0305 other medical science ,Adaptation (computer science) ,Cluster analysis ,ComputingMilieux_MISCELLANEOUS ,Speaker adaptation - Abstract
Deep neural networks (DNNs) are currently very successful for acoustic modeling in ASR systems. One of the main challenges with DNNs is unsupervised speaker adaptation from an initial speaker clustering, because DNNs have a very large number of parameters. Recently, a method has been proposed to adapt DNNs to speakers by combining speaker-specific information (in the form of i-vectors computed at the speaker-cluster level) with fMLLR-transformed acoustic features. In this paper we try to gain insight into what kind of adaptation is performed on DNNs when stacking i-vectors with acoustic features and what information exactly is carried by i-vectors. We observe on the REPERE corpus that DNNs trained on i-vector features concatenated with fMLLR-transformed acoustic features lead to a gain of 0.7 points. The experiments show that using i-vector stacking in DNN acoustic models performs not only speaker adaptation, but also adaptation to acoustic conditions.
- Published
- 2014
143. Adapting dependency parsing to spontaneous speech for open domain spoken language understanding
- Author
-
Alexis Nasr, Benoit Favre, Frédéric Béchet, Favre, Benoit, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Computer science ,Speech recognition ,media_common.quotation_subject ,02 engineering and technology ,computer.software_genre ,Semantics ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Rule-based machine translation ,Discriminative model ,Dependency grammar ,0202 electrical engineering, electronic engineering, information engineering ,Conversation ,ComputingMilieux_MISCELLANEOUS ,media_common ,Parsing ,business.industry ,020206 networking & telecommunications ,Syntax ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,Artificial intelligence ,FrameNet ,Transcription (software) ,0305 other medical science ,business ,computer ,Natural language processing ,Spoken language - Abstract
Parsing human-human conversations consists in automatically enriching text transcription with semantic structure information. We use in this paper a FrameNet-based approach to semantics that, without needing a full semantic parse of a message, goes further than a simple flat translation of a message into basic concepts. FrameNet-based semantic parsing may follow a syntactic parsing step; however, spoken conversations in customer service telephone call centers present very specific characteristics such as non-canonical language, noisy messages (disfluencies, repetitions, truncated words or automatic speech transcription errors) and the presence of superfluous information. For syntactic parsing, the traditional view based on context-free grammars is not suitable for processing non-canonical text. New approaches to parsing based on dependency structures and discriminative machine learning techniques are better suited to processing spontaneous speech for two main reasons: (a) they need less training data and (b) the annotation of conversation transcripts with syntactic dependencies is simpler than with syntactic constituents. Another advantage is that partial annotation can be performed. This paper presents the adaptation of a syntactic dependency parser to process very spontaneous speech recorded in a call-centre environment. This parser is used in order to produce FrameNet candidates for characterizing conversations between an operator and a caller.
- Published
- 2014
144. Automatic human utility evaluation of ASR systems: does WER really predict performance?
- Author
-
Cosmin Munteanu, Adam Lee, Ani Nenkova, Stephen Tratz, Kyla Cheung, Siavash Kazemian, Clare R. Voss, Benoit Favre, Dennis Ochei, Yang Liu, Frauke Zeller, Gerald Penn, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Columbia University [New York], University of Toronto, Department of Chemistry [York, UK], University of York [York, UK], City University of New York [New York] (CUNY), Department of Computer Science [Dallas] (University of Texas at Dallas), University of Texas at Dallas [Richardson] (UT Dallas), University of Pennsylvania [Philadelphia], US Army Research Laboratory-CIS Directorate (ARL), United States Army (U.S. Army), University College of London [London] (UCL), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), University of Pennsylvania, and Favre, Benoit
- Subjects
Computer science ,Speech recognition ,Word error rate ,020206 networking & telecommunications ,02 engineering and technology ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,Task (project management) ,030507 speech-language pathology & audiology ,03 medical and health sciences ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,0202 electrical engineering, electronic engineering, information engineering ,Metric (unit) ,0305 other medical science ,ComputingMilieux_MISCELLANEOUS - Abstract
We propose an alternative evaluation metric to Word Error Rate (WER) for the decision audit task of meeting recordings, which exemplifies how to evaluate speech recognition within a legitimate application context. Using machine learning on an initial seed of human-subject experimental data, our alternative metric handily outperforms WER, which correlates very poorly with human subjects’ success in finding decisions given ASR transcripts with a range of WERs.
- Published
- 2013
145. ASR Error Segment Localization for Spoken Recovery Strategy
- Author
-
Benoit Favre, Frédéric Béchet, Favre, Benoit, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Voice activity detection ,Computer science ,business.industry ,Speech recognition ,05 social sciences ,Word error rate ,02 engineering and technology ,Speech processing ,computer.software_genre ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Artificial intelligence ,Dialog system ,business ,computer ,050107 human factors ,Natural language processing ,ComputingMilieux_MISCELLANEOUS - Abstract
Even though small ASR errors might not impact downstream processes that make use of the transcript, larger error segments like those generated by OOVs can have a considerable impact on applications such as speech-to-speech translation and can eventually lead to communication failure between users of the system. This work focuses on error detection in ASR output targeted towards significant error segments that can be recovered using a dialog system. We propose a CRF system trained to recognize error segments with ASR confidence-based, lexical and syntactic features. The most significant error segment is passed to a dialog system for interactive recovery in which rephrased words are reinserted in the original. 22% of utterances can be fully recovered and an interesting by-product is that rewriting error segments as a single token reduces WER by 17% on an adverse corpus.
- Published
- 2013
146. Understand the Global Economic Crisis: A Text Summarization Approach
- Author
-
Shuhua Liu, Benoit Favre, Arcada University of Applied Sciences, University of Texas at Dallas [Richardson] (UT Dallas), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), and Favre, Benoit
- Subjects
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,ComputingMilieux_MISCELLANEOUS ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] - Abstract
International audience; no abstract
- Published
- 2013
147. Generative Constituent Parsing and Discriminative Dependency Reranking: Experiments on English and French
- Author
-
Joseph Le Roux, Benoit Favre, Alexis Nasr, Seyed Abolghasem Mirroshandel, Université Paris 13 (UP13), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Laboratoire d'Informatique de Paris-Nord (LIPN), Université Paris 13 (UP13)-Institut Galilée-Université Sorbonne Paris Cité (USPC)-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), ANR-08-EMER-0013,SEQUOIA,Analyse syntaxique probabiliste à large couverture du français(2008), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Le Roux, Joseph, Analyse syntaxique probabiliste à large couverture du français - - SEQUOIA2008 - ANR-08-EMER-0013 - DEFIS - VALID, and Favre, Benoit
- Subjects
TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,ComputingMilieux_MISCELLANEOUS ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] - Abstract
International audience; We present an architecture for parsing in two steps. A phrase-structure parser builds for each sentence an n-best list of analyses which are converted to dependency trees. These dependency structures are then rescored by a discriminative reranker. Our method is language agnostic and enables the incorporation of additional information that is useful for the choice of the best parse candidate. We test our approach on the Penn Treebank and the French Treebank. Evaluation shows a significant improvement on different parse metrics.
- Published
- 2012
148. Syntactic annotation of spontaneous speech: application to call-center conversation data
- Author
-
Thierry Bazillon, Melanie Delplano, Frédéric Béchet, Alexis Nasr, Benoit Favre, Favre, Benoit, Laboratoire d'Informatique de l'Université du Maine (LIUM), Le Mans Université (UM)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), and Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,ComputingMilieux_MISCELLANEOUS - Abstract
International audience; no abstract
- Published
- 2012
149. Detecting Person Presence in TV Shows with Linguistic and Structural Features
- Author
-
Géraldine Damnati, Frédéric Béchet, Benoit Favre, Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), France Télécom Recherche et Développement [Lannion] (FTR&D), France Télécom, Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), and Favre, Benoit
- Subjects
Focus (computing) ,Image fusion ,Boosting (machine learning) ,business.industry ,Computer science ,Speech recognition ,Feature extraction ,Frame (networking) ,Context (language use) ,Pragmatics ,computer.software_genre ,Linguistics ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,030507 speech-language pathology & audiology ,03 medical and health sciences ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,Face (geometry) ,Artificial intelligence ,0305 other medical science ,business ,computer ,Natural language processing ,ComputingMilieux_MISCELLANEOUS - Abstract
Person detection and recognition in videos is a hard problem due to the intrinsic ambiguities of the sound and image channels and their interaction. Whatever method is used to extract person hypotheses from the audio or the image channels, person recognition in videos relies on a multimodal decision process that merges the different hypotheses produced in order to decide, for each frame, who is present in the video at the audio level, at the image level or at the content level (person mention in speech or inserted text boxes). In this framework the focus of this paper is to produce a list of person presence hypotheses from the audio channel of a video document only, to be used in addition to person presence detected at the image level by a multimodal fusion process. In this study we focus on the audio channel only, using two kinds of features: linguistic features corresponding to the way a person is mentioned by a speaker; structural features corresponding to the context of occurrence of a name in a show. We show that both sets of features are complementary and that good results can be achieved on a TV show corpus annotated with person presence labels.
- Published
- 2012
150. Modèles génératif et discriminant en analyse syntaxique : expériences sur le corpus arboré de Paris 7
- Author
-
Joseph Le Roux, Benoit Favre, Seyed Abolghasem Mirroshandel, Alexis Nasr, Université Paris 13 (UP13), Laboratoire d'informatique Fondamentale de Marseille - UMR 6166 (LIF), Université de la Méditerranée - Aix-Marseille 2-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'informatique Fondamentale de Marseille (LIF), Centre National de la Recherche Scientifique (CNRS)-École Centrale de Marseille (ECM)-Aix Marseille Université (AMU), Le Roux, Joseph, Analyse syntaxique probabiliste à large couverture du français - - SEQUOIA2008 - ANR-08-EMER-0013 - DEFIS - VALID, Traitement Automatique du Langage Ecrit et Parlé (TALEP), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), ANR-08-EMER-0013,SEQUOIA,Analyse syntaxique probabiliste à large couverture du français(2008), Favre, Benoit, and Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
corpus arboré ,réordonnancement discriminant ,apprentissage automatique ,[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL] ,ComputingMilieux_MISCELLANEOUS ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,analyse syntaxique - Abstract
International audience; We present a two-step architecture for syntactic parsing. First, a phrase-structure parser builds, for each sentence, a list of analyses that are converted to dependency trees. These trees are then rescored by a discriminative reranker. This method makes it possible to take into account information to which the parser has no access, in particular functional annotations. We validate our approach with an evaluation on the Paris 7 treebank (corpus arboré de Paris 7). The second step significantly improves the quality of the returned analyses, whichever metric is used.
- Published
- 2011