16 results for "WSD"
Search Results
2. Verifying Usefulness of Algorithms for WordNet Based Similarity Sense Disambiguation
- Author
-
Kukla, Elżbieta, Siemiński, Andrzej, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Choroś, Kazimierz, editor, Kopel, Marek, editor, Kukla, Elżbieta, editor, and Siemiński, Andrzej, editor
- Published
- 2019
3. Practice of Word Sense Disambiguation
- Author
-
Siemiński, Andrzej, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Nguyen, Ngoc Thanh, editor, Hoang, Duong Hung, editor, Hong, Tzung-Pei, editor, Pham, Hoang, editor, and Trawiński, Bogdan, editor
- Published
- 2018
4. Word Sense Disambiguation Using IndoWordNet
- Author
-
Bhingardive, Sudha, Bhattacharyya, Pushpak, Dash, Niladri Sekhar, editor, Bhattacharyya, Pushpak, editor, and Pawar, Jyoti D., editor
- Published
- 2017
5. Annotating Words Using WordNet Semantic Glosses
- Author
-
Szymański, Julian, Duch, Włodzisław, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Huang, Tingwen, editor, Zeng, Zhigang, editor, Li, Chuandong, editor, and Leung, Chi Sing, editor
- Published
- 2012
6. Word Sense Disambiguation using Aggregated Similarity based on WordNet Graph Representation
- Author
-
Mădălina ZURINI
- Subjects
WSD, Similarity Measure, WordNet, Ontology, Synset, Computer engineering. Computer hardware, TK7885-7895, Bibliography. Library science. Information resources
- Abstract
The term word sense disambiguation (WSD) is introduced in the context of text document processing. A knowledge-based approach is taken using the WordNet lexical ontology, describing its structure and the components used to identify the context-related senses of each polysemous word. The principal distance measures over the graph associated with WordNet are presented, with an analysis of their advantages and disadvantages. A general model for aggregating distances and probabilities is proposed and implemented in an application in order to detect the contextual sense of each word. For words that do not exist in WordNet, a similarity measure based on co-occurrence probabilities is used. The WSD module is proposed for integration into document-processing steps such as supervised and unsupervised classification in order to maximize the correctness of the classification. Future work concerns the implementation of different domain-oriented ontologies.
- Published
- 2013
7. Word Sense Disambiguation using Aggregated Similarity based on WordNet Graph Representation.
- Author
-
ZURINI, Mădălina
- Subjects
POLYSEMY, SEMANTICS, PROBABILITY theory, ONTOLOGY, DOCUMENT markup languages, TEXT processing (Computer science)
- Abstract
The term word sense disambiguation (WSD) is introduced in the context of text document processing. A knowledge-based approach is taken using the WordNet lexical ontology, describing its structure and the components used to identify the context-related senses of each polysemous word. The principal distance measures over the graph associated with WordNet are presented, with an analysis of their advantages and disadvantages. A general model for aggregating distances and probabilities is proposed and implemented in an application in order to detect the contextual sense of each word. For words that do not exist in WordNet, a similarity measure based on co-occurrence probabilities is used. The WSD module is proposed for integration into document-processing steps such as supervised and unsupervised classification in order to maximize the correctness of the classification. Future work concerns the implementation of different domain-oriented ontologies. [ABSTRACT FROM AUTHOR]
- Published
- 2013
8. Multilingual versus monolingual word sense disambiguation.
- Author
-
Ion, Radu and Tufiş, Dan
- Subjects
MULTILINGUALISM, MONOLINGUALISM, LANGUAGE & languages, VOCABULARY, LEARNING
- Abstract
This article describes two different word sense disambiguation (WSD) systems, one applicable to parallel corpora and requiring aligned wordnets and the other one, knowledge poorer, albeit more relevant for real applications, relying on unsupervised learning methods and only monolingual data (text and wordnet). Comparing performances of word sense disambiguation systems is a very difficult evaluation task when different sense inventories are used and even more difficult when the sense distinctions are not of the same granularity. However, as we used the same sense inventory, the performance of the two WSD systems can be objectively compared and we bring evidence that multilingual WSD is more precise than monolingual WSD. [ABSTRACT FROM AUTHOR]
- Published
- 2009
9. Integrating Linguistic Resources in TC through WSD.
- Author
-
Ureña-López, L. Alfonso, Buenaga, Manuel, and Gómez, José M.
- Subjects
LINGUISTICS, LANGUAGE & languages, DATABASES, INFORMATION storage & retrieval systems, ELECTRONIC data processing, COMPUTERS
- Abstract
Information access methods must be improved to overcome the information overload that most professionals face nowadays. Text classification tasks, like Text Categorization (TC), help users access the large amount of text they find on the Internet and in their organizations. TC is the classification of documents into a predefined set of categories. Most approaches to automatic TC are based on the use of a training collection, which is a set of manually classified documents. Other emerging linguistic resources, like lexical databases, can also be used for classification tasks. This article describes an approach to TC based on the integration of a training collection (Reuters-21578) and a lexical database (WordNet 1.6) as knowledge sources. Lexical databases accumulate information on the lexical items of one or several languages. This information must be filtered in order to make effective use of it in our model of TC. This filtering process is a Word Sense Disambiguation (WSD) task. WSD is the identification of the sense of words in context. This task is an intermediate process in many natural language processing tasks like machine translation or multilingual information retrieval. We present the use of WSD as an aid for TC. Our approach to WSD is also based on the integration of two linguistic resources: a training collection (SemCor and Reuters-21578) and a lexical database (WordNet 1.6). We have developed a series of experiments that show that TC and WSD based on the integration of linguistic resources are very effective, and that WSD is necessary to effectively integrate linguistic resources in TC. [ABSTRACT FROM AUTHOR]
- Published
- 2001
10. Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources
- Author
-
Marlena Orlinska, Maciej Piasecki, and Paweł Kędzia
- Subjects
Linguistics and Language, Computer Networks and Communications, Computer science, WordNet, Scale (descriptive set theory), Ontology (information science), computer.software_genre, Lexicon, lcsh:P325-325.5, WSD, page rank, Structure (mathematical logic), graphs, business.industry, Communication, lexical resources, lcsh:P98-98.5, Part of speech, plWordNet, lcsh:Lexicography, word sense disambiguation, SUMO, Artificial intelligence, lcsh:Computational linguistics. Natural language processing, business, computer, lcsh:P327-327.5, Natural language processing, Natural language, Word (computer architecture), lcsh:Semantics
- Abstract
Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult and not yet fully solved, and different lexical resources provide varied support for it. Polish CLARIN lexical semantic resources are based on plWordNet, a very large wordnet for Polish, as a central structure that serves as a basis for linking together several resources of different types. In this paper, several Word Sense Disambiguation (henceforth WSD) methods developed for Polish that utilise plWordNet are discussed. Textual sense descriptions in a traditional lexicon can be compared with text contexts using Lesk's algorithm in order to find the best matching senses. In the case of a wordnet, lexico-semantic relations provide the main description of word senses. Thus, first, we adapted and applied to Polish a WSD method based on Page Rank. In this method, text words are mapped onto their senses in the plWordNet graph and the Page Rank algorithm is run to find the senses with the highest scores. The method yields results lower than, but comparable to, those reported for English. The error analysis showed that the main problems are fine-grained sense distinctions in plWordNet and the limited number of connections between words of different parts of speech. In the second approach, plWordNet expanded with a mapping onto SUMO ontology concepts was used. Two scenarios for WSD were investigated: two-step disambiguation and disambiguation based on combined networks of plWordNet and SUMO. In the former scenario, words are first assigned SUMO concepts and then plWordNet senses are disambiguated. In the latter, plWordNet and SUMO are combined into one large network that is then used for the disambiguation of senses. The additional knowledge sources used in WSD improved the performance. The obtained results and potential further lines of development are discussed.
- Published
- 2015
11. Uso de representaciones vectoriales de las palabras para la detección de dobles sentidos (puns)
- Author
-
Carrasco Gómez, Pascual Andrés
- Subjects
Joc de paraules, Polisemia, Reconeixement de Formes i Imatge Digital [Máster Universitario en Inteligencia Artificial, Reconocimiento de Formas e Imagen Digital-Màster Universitari en Intel·Ligència Artificial], WordNet, Lenguaje natural, Desambiguació semàntica, Puns, Polisèmia, Wordplay, Embeddings, Llenguatge natural, Juego de palabras, Natural language, Máster Universitario en Inteligencia Artificial, Reconocimiento de Formas e Imagen Digital-Màster Universitari en Intel·Ligència Artificial: Reconeixement de Formes i Imatge Digital, Semantic disambiguation, WSD, LENGUAJES Y SISTEMAS INFORMATICOS, Polysemy, Desambiguación semántica
- Abstract
Semantic disambiguation and natural language understanding are areas of natural language processing that, although widely studied, continue to pose a major challenge. Traditional approaches to semantic disambiguation rest on the assumption that there is a unique and unequivocal meaning underlying each word in a sentence. However, there is a class of constructions in language known as puns, in which lexical-semantic ambiguity is a sought-after effect. That is, the speaker or writer intends a particular word or other lexical element to be interpreted simultaneously with two or more different meanings. In this project we propose to address the location and disambiguation of double-meaning words (puns) in a set of sentences. To do this we use different vector representations of words obtained from different corpora, and study different similarity metrics. The data sets come from the SemEval 2017 international competition, so the results can be compared with those published by the competition.
- Published
- 2017
12. Word Sense Disambiguation using Aggregated Similarity based on WordNet Graph Representation
- Author
-
Mădălina Zurini
- Subjects
lcsh:Computer engineering. Computer hardware, Computer science, WordNet, WSD, Similarity Measure, WordNet, Ontology, Synset, lcsh:TK7885-7895, Similarity measure, Ontology (information science), computer.software_genre, Synset, Taxonomy (general), Similarity (psychology), WSD, Polysemy, Ontology, business.industry, lcsh:Z, lcsh:Bibliography. Library science. Information resources, Similarity Measure, ComputingMethodologies_PATTERNRECOGNITION, Knowledge base, Graph (abstract data type), Artificial intelligence, business, computer, Natural language processing
- Abstract
The term word sense disambiguation (WSD) is introduced in the context of text document processing. A knowledge-based approach is taken using the WordNet lexical ontology, describing its structure and the components used to identify the context-related senses of each polysemous word. The principal distance measures over the graph associated with WordNet are presented, with an analysis of their advantages and disadvantages. A general model for aggregating distances and probabilities is proposed and implemented in an application in order to detect the contextual sense of each word. For words that do not exist in WordNet, a similarity measure based on co-occurrence probabilities is used. The WSD module is proposed for integration into document-processing steps such as supervised and unsupervised classification in order to maximize the correctness of the classification. Future work concerns the implementation of different domain-oriented ontologies.
Keywords: WSD, Similarity Measure, WordNet, Ontology, Synset
(ProQuest: ... denotes formulae omitted.)
1 Introduction
For the acquisition of knowledge in artificial intelligence, two approaches defined in [1] are used:
* a transfer process from human to knowledge base, whose major disadvantage is that the person who holds the knowledge cannot easily identify it;
* a conceptual modeling process that builds models into which new knowledge is placed as it is acquired; this process leads to the appearance of the ontology as a systematic organization of knowledge and data about reality, leading to the construction of theories about what exists.
An essential role of an ontology is to be reused in multiple applications. Mapping two or more ontologies onto each other is called alignment. This task is particularly difficult and is the main limitation in extending existing ontologies [1].
The direction that ontology follows is supported by the introduction of artificial intelligence techniques to emulate the mental representation of the concepts used and the interpenetration of these links.
The kernel of the ontology is defined as a system $O = (\mathcal{L}, T, C^{*}, H, \mathrm{ROOT})$, where:
* $\mathcal{L}$ is the lexicon formed from the terms of the natural language;
* $C^{*}$ is a set of concepts;
* $T$ is the reference function that maps the set of terms of the lexicon to the set of concepts;
* $H$ is the hierarchy of the taxonomy, given by the directed, acyclic, transitive and reflexive relation;
* $\mathrm{ROOT}$ is the starting point upon which the hierarchy is built.
There are two types of ontologies as defined in [1], depending on the area in which they are used:
* ontologies for knowledge-based systems, characterized by a relatively small number of concepts linked by a large and varied set of relationships; concepts are grouped into complex conceptual schemes or scenarios, and each concept can have one or more customizations;
* lexicalized ontologies, which include a large number of concepts linked by a small number of relationships, such as the WordNet ontology, whose concepts are represented by sets of synonymous words; these ontologies are used in human language processing systems.
The concept of ontology is introduced as a knowledge base in the classification of documents, in order to analyze documents semantically by resolving the ambiguity of terms. This integration results in an improvement in the objective function defined for the classification techniques used. The main components of an ontology, the concepts and the relations between them, are described. These components are analyzed, identifying methods of extracting knowledge from within.
With the defined relationships between concepts, a graph representation is created, seen as an "is-a" taxonomy relating concepts to more general ones. The senses of a concept are defined, along with the possibility of a graph representation of each sense. …
- Published
- 2013
13. Word sense disambiguation in webpages. Developing a program capable to disambiguate words with a website text as context
- Author
-
Sekkingstad, Andreas
- Subjects
semantic web, Wordnet, WSD, nlp, semantikk
- Abstract
This master thesis investigated automatic methods of Word Sense Disambiguation (WSD) in HTML pages. The hypothesis was that HTML documents provide various disambiguation cues which are not normally present in general text, and which can enhance the quality of WSD. We tested several existing natural language processing toolkits which provide general WSD services, and compared these to our novel algorithms which were designed to take advantage of the HTML cues. The findings showed that our new algorithms outperformed state of the art general WSD implementations. In addition, our algorithm could provide a ranked list of potential disambiguations, which is useful in an example use case where users “tag” key words in a web page with the help of the disambiguating algorithm.
- Published
- 2016
14. Annotating Words Using WordNet Semantic Glosses
- Author
-
Duch, Włodzisław and Szymański, Julian
- Subjects
WordNet, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, WSD, Word Sense Disambiguation, NLP, Wikipedia
- Abstract
An approach to word sense disambiguation (WSD) relying on WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with state-of-the-art WSD methods indicates that the use of WordNet relations and semantically tagged glosses should enhance the accuracy of word disambiguation methods.
- Published
- 2012
15. Regular Polysemy in WordNet
- Author
-
Barque, Lucie, Chaumartin, François-Régis, Langues, textes, traitement informatique, cognition (LaTTice), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS), Proxem, Analyse Linguistique Profonde à Grande Echelle, Large-scale deep linguistic processing (ALPAGE), Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université Paris Diderot - Paris 7 (UPD7), and École normale supérieure - Paris (ENS-PSL)
- Subjects
regular polysemy, ComputingMethodologies_PATTERNRECOGNITION, WordNet, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, metonymy, WSD, [SHS.LANGUE]Humanities and Social Sciences/Linguistics, metaphor, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
- Abstract
The importance of describing regular polysemy in a lexicon has often been outlined, especially in the field of natural language processing (for a good overview of this issue, see (Ravin and Leacock, 2000)). Unfortunately, no existing broad-coverage semantic lexicon has been built following this relatively recent advice. And since producing a broad coverage semantic lexicon is a very time-consuming task, one has to put this idea into practice on existing lexicons. WordNet is an appropriate lexical semantic resource for running this experiment as it is machine readable and has a wide coverage (Fellbaum, 1998). In this paper, we introduce a method to create regular polysemy patterns from WordNet data and to automatically detect their occurrences in the lexicon.
- Published
- 2009
16. Aprendizaje competitivo LVQ para la desambiguación léxica
- Author
-
García Vega, Manuel, Martín Valdivia, María Teresa, and Ureña López, Luis Alfonso
- Subjects
SENSEVAL, Competitive learning, WordNet, Neural nets, LVQ, Redes neuronales, WSD, Aprendizaje competitivo, SemCor
- Abstract
Word sense disambiguation significantly improves many natural language processing tasks. We present a supervised disambiguator based on the Vector Space Model whose weights are trained with a competitive algorithm based on the Kohonen model, specifically learning vector quantization (LVQ). It makes use of the various semantic relations of WordNet and of the SemCor corpus. The disambiguator is evaluated by simulating participation in the SENSEVAL-2 competition; as the results show, the position obtained is very good. This work was funded by the MCYT through project FIT-150500-2003-412.
- Published
- 2003
Catalog
Discovery Service for Jio Institute Digital Library