1. Computing Trees of Named Word Usages from a Crowdsourced Lexical Network
- Author
Alain Joubert and Mathieu Lafourcade, Exploration et exploitation de données textuelles (TEXTE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
- Subjects
Root (linguistics), Computer science, business.industry, media_common.quotation_subject, 05 social sciences, Word processing, [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, 02 engineering and technology, Ambiguity, computer.software_genre, 050105 experimental psychology, Term (time), Tree (data structure), Node (computer science), 0202 electrical engineering, electronic engineering, information engineering, Word usage, 020201 artificial intelligence & image processing, 0501 psychology and cognitive sciences, Artificial intelligence, business, computer, Word (computer architecture), Natural language processing, media_common
- Abstract
Thanks to the participation of a large number of people in web-based games, a large, evolving lexical network is available for French. Using this resource, we address the question of determining the word usages of a term, and then introduce a notion of similarity between these word usages. This lets us build, for a given term, its word usage tree: the root groups together all possible usages of the term, and descending the tree corresponds to refining these usages. The nodes of the word usage tree are labelled during a breadth-first search: the root is labelled by the term itself, and each other node is labelled by a term drawn from the clique or quasi-clique that the node represents. We show on a specific example that some nodes of the tree, often leaves, cannot be labelled unambiguously. The paper ends with an evaluation of the word usages detected in our lexical network.
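The breadth-first labelling described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a label is chosen from a node's clique, so the rule used here (take the first clique term not already used as a label elsewhere in the tree) is an assumption, as are the names `UsageNode` and `label_usage_tree` and the toy English data, none of which come from the paper.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UsageNode:
    # Terms of the clique or quasi-clique this node represents (hypothetical representation).
    clique: List[str]
    children: List["UsageNode"] = field(default_factory=list)
    label: Optional[str] = None  # None means "could not be labelled unambiguously"


def label_usage_tree(term: str, root: UsageNode) -> None:
    """Label nodes in breadth-first order; the selection rule is an assumption."""
    root.label = term          # the root is labelled by the term itself
    used = {term}              # labels already assigned somewhere in the tree
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in node.children:
            # Assumed rule: first clique term that is not yet used as a label.
            candidates = [t for t in child.clique if t not in used]
            child.label = candidates[0] if candidates else None
            if child.label is not None:
                used.add(child.label)
            queue.append(child)


# Toy data, invented for illustration only.
root = UsageNode(clique=["mouse"])
animal = UsageNode(clique=["mouse", "rodent", "animal"])
device = UsageNode(clique=["mouse", "computer", "pointer"])
leaf = UsageNode(clique=["mouse", "rodent"])  # every term is already taken
animal.children.append(leaf)
root.children = [animal, device]

label_usage_tree("mouse", root)
```

In this toy tree the leaf ends up with no label because every term of its clique has already been used higher up, which mirrors the abstract's remark that some nodes, often leaves, cannot be labelled without ambiguity.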
- Published
- 2010