Challenging the Boundaries of Unsupervised Learning for Semantic Similarity
- Source :
- IEEE Access, Vol 7, Pp 16291-16308 (2019)
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
Abstract
- Semantic analysis plays a crucial role in text analytics research. Calculating the semantic similarity between sentences is a long-standing problem in natural language processing, and it varies significantly from one domain to another. In this paper, we present a methodology that can be applied across multiple domains by incorporating corpora-based statistics into a standardized semantic similarity algorithm. To calculate the semantic similarity between words and sentences, the proposed method follows an edge-based approach using a lexical database. When tested against both benchmark standards and a mean human similarity dataset, the methodology achieves a high correlation for both word similarity (r = 0.8753) and sentence similarity (r = 0.8793) with respect to the Rubenstein and Goodenough standard, and on the SICK dataset (r = 0.83241), outperforming other unsupervised models.
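- To illustrate what an edge-based word similarity measure looks like in general, the following is a minimal sketch and not the paper's algorithm. It assumes a tiny hand-made taxonomy (the hypothetical `HYPERNYM` table) in place of a real lexical database, counts the edges on the shortest path between two words, and maps that count to a score with the generic formula 1 / (1 + path length). The function names and the scoring formula are illustrative assumptions; the corpora-based statistics mentioned in the abstract are not modeled here.

```python
from collections import deque

# Hypothetical toy taxonomy (child -> parent); a stand-in for a real lexical database.
HYPERNYM = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "hammer": "tool",
    "tool": "artifact",
    "artifact": "entity",
}


def build_graph(hypernym):
    """Turn the child -> parent table into an undirected adjacency map."""
    graph = {}
    for child, parent in hypernym.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def path_length(graph, source, target):
    """Number of edges on the shortest path, or None if no path exists."""
    if source not in graph or target not in graph:
        return None
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def edge_similarity(graph, word_a, word_b):
    """Assumed scoring rule: 1 / (1 + shortest path length); 0.0 if unconnected."""
    dist = path_length(graph, word_a, word_b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


GRAPH = build_graph(HYPERNYM)
print(edge_similarity(GRAPH, "car", "bicycle"))  # two edges apart
print(edge_similarity(GRAPH, "car", "hammer"))   # four edges apart
```

In this sketch, words that share a close common ancestor (here "car" and "bicycle") score higher than words joined only through a more general node (here "car" and "hammer"), which is the basic intuition behind edge-based measures.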
Details
- Language :
- English
- ISSN :
- 21693536
- Volume :
- 7
- Database :
- Directory of Open Access Journals
- Journal :
- IEEE Access
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.3f7a8ed31b44092b65e5b7ec5685d0f
- Document Type :
- article
- Full Text :
- https://doi.org/10.1109/ACCESS.2019.2891692