Improving selection of synsets from WordNet for domain-specific word sense disambiguation.

Authors :
Lopez-Arevalo, Ivan
Sosa-Sosa, Victor J.
Rojas-Lopez, Franco
Tello-Leal, Edgar
Source :
Computer Speech & Language. Jan 2017, Vol. 41, pp. 128-145. 18 pp.
Publication Year :
2017

Abstract

Word Sense Disambiguation (WSD) is a fundamental task useful for Information Retrieval, Information Extraction, web search, and indexing, among others. Several works in the literature address the generic WSD task, but in recent years domain-specific WSD has attracted the attention of several researchers. This paper describes an approach to domain-specific WSD that selects the predominant sense (a synset from WordNet) of ambiguous words. To do so, the method uses two corpora: a domain-specific test corpus (containing the target ambiguous words) and a domain-specific auxiliary corpus (obtained by using relevant words from the test corpus). The approach has four main stages: (1) auxiliary corpus generation; (2) related features extraction (from the auxiliary corpus); (3) test features extraction (from the test corpus); and (4) features integration. The proposed approach has been tested on domain-specific corpora (Sports and Finance) and on one balanced corpus, the BNC. Even though our WSD approach showed some limitations when dealing with the general-domain corpus, the results obtained for the domain-specific corpora, which are our main interest, were better than those reported in previous works. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0885-2308
Volume :
41
Database :
Academic Search Index
Journal :
Computer Speech & Language
Publication Type :
Academic Journal
Accession number :
117894321
Full Text :
https://doi.org/10.1016/j.csl.2016.06.003