
A dataset for evaluating Bengali word sense disambiguation techniques.

Authors :
Das Dawn, Debapratim
Khan, Abhinandan
Shaikh, Soharab Hossain
Pal, Rajat Kumar
Source :
Journal of Ambient Intelligence & Humanized Computing; Apr2023, Vol. 14 Issue 4, p4057-4086, 30p
Publication Year :
2023

Abstract

Computational processing of natural language supports effective communication by retrieving the correct sense of each word. A word may be monosemous or polysemous. The use of polysemous words in an appropriate context plays a critical role in communication. Over the last two decades, a significant amount of research has been done on automatically determining the correct sense of a polysemous word, a task known as word sense disambiguation. A word sense disambiguation algorithm identifies the proper sense of a polysemous word by analysing the contextual data. Nevertheless, there is a gap in the contemporary literature regarding the availability of datasets in Asian languages, especially Bengali. Therefore, in this work, we have presented a dataset comprising one hundred Bengali polysemous words. Each word in this dataset has three or four disjoint senses, and each sense is represented by ten paragraphs. Each paragraph describes the sense of a particular polysemous word. We have performed statistical analysis on the basis of seven relevant and important characteristics. A general framework has also been presented for training and testing with possible guidelines for performance analysis. A baseline strategy has been introduced based on four feature sets. Finally, a set of experiments has been performed to analyse the system performance. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1868-5137
Volume :
14
Issue :
4
Database :
Complementary Index
Journal :
Journal of Ambient Intelligence & Humanized Computing
Publication Type :
Academic Journal
Accession number :
162727722
Full Text :
https://doi.org/10.1007/s12652-022-04471-y