A dataset for evaluating Bengali word sense disambiguation techniques.
- Source :
- Journal of Ambient Intelligence & Humanized Computing; Apr 2023, Vol. 14, Issue 4, pp. 4057-4086, 30 pp.
- Publication Year :
- 2023
Abstract
- Computational processing of natural language supports effective communication by retrieving the correct sense of each word. A word may be monosemous or polysemous, and the use of polysemous words in an appropriate context plays a critical role in communication. Over the last two decades, a significant amount of research has addressed automatically resolving the correct sense of a polysemous word, a task known as word sense disambiguation. A word sense disambiguation algorithm identifies the proper sense of a polysemous word by analysing the contextual data. Nevertheless, there is a gap in the contemporary literature regarding the availability of datasets in Asian languages, especially Bengali. Therefore, in this work, we present a dataset comprising one hundred Bengali polysemous words. Each word in this dataset has three or four disjoint senses, and each sense is represented by ten paragraphs. Each paragraph describes the sense of a particular polysemous word. We have performed a statistical analysis on the basis of seven relevant and important characteristics. A general framework has also been presented for training and testing, with possible guidelines for performance analysis. A baseline strategy has been introduced based on four feature sets. Finally, a set of experiments has been performed to analyse the system performance. [ABSTRACT FROM AUTHOR]
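The dataset layout described in the abstract (one hundred words, three or four senses per word, ten paragraphs per sense) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record fields `word`, `sense`, and `text`, the placeholder strings, the helper names, and the 80/20 split per sense. The abstract does not specify the file format, the split ratio, or the evaluation protocol used by the authors.

```python
import random
from collections import defaultdict


def make_placeholder_dataset(num_words=100, senses_per_word=(3, 4),
                             paragraphs_per_sense=10, seed=0):
    """Build placeholder records with the shape given in the abstract.

    The texts are dummy strings, not real Bengali paragraphs.
    """
    rng = random.Random(seed)
    records = []
    for w in range(num_words):
        n_senses = rng.choice(senses_per_word)  # three or four senses per word
        for s in range(n_senses):
            for p in range(paragraphs_per_sense):
                records.append({
                    "word": f"word_{w}",
                    "sense": f"sense_{s}",
                    "text": f"placeholder paragraph {p}",
                })
    return records


def split_per_sense(records, test_fraction=0.2, seed=0):
    """Hold out a fraction of paragraphs from every (word, sense) pair.

    Assumed protocol: splitting within each sense keeps every sense
    represented in both the training and the test portion.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[(record["word"], record["sense"])].append(record)

    train, test = [], []
    for key in sorted(groups):
        items = list(groups[key])
        rng.shuffle(items)
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def accuracy(gold_senses, predicted_senses):
    """Fraction of test paragraphs whose predicted sense matches the gold sense."""
    if len(gold_senses) != len(predicted_senses):
        raise ValueError("gold and predicted lists must have the same length")
    if not gold_senses:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_senses, predicted_senses))
    return correct / len(gold_senses)


if __name__ == "__main__":
    data = make_placeholder_dataset()
    train, test = split_per_sense(data)
    print(len(data), len(train), len(test))
```

With these figures the dataset holds between 3,000 and 4,000 paragraphs in total, depending on how many words have four senses rather than three.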
Details
- Language :
- English
- ISSN :
- 18685137
- Volume :
- 14
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Journal of Ambient Intelligence & Humanized Computing
- Publication Type :
- Academic Journal
- Accession number :
- 162727722
- Full Text :
- https://doi.org/10.1007/s12652-022-04471-y