Start Over

Identifying Bengali Multiword Expressions using Semantic Clustering

Authors :: Chakraborty, Tanmoy
Das, Dipankar
Bandyopadhyay, Sivaji
Chakraborty, Tanmoy
Das, Dipankar
Bandyopadhyay, Sivaji
Publication Year :: 2014
Abstract: One of the key issues in both natural language understanding and generation is the appropriate processing of Multiword Expressions (MWEs). MWEs pose a huge problem to the precise language processing due to their idiosyncratic nature and diversity in lexical, syntactical and semantic properties. The semantics of a MWE cannot be expressed after combining the semantics of its constituents. Therefore, the formalism of semantic clustering is often viewed as an instrument for extracting MWEs especially for resource constraint languages like Bengali. The present semantic clustering approach contributes to locate clusters of the synonymous noun tokens present in the document. These clusters in turn help measure the similarity between the constituent words of a potentially candidate phrase using a vector space model and judge the suitability of this phrase to be a MWE. In this experiment, we apply the semantic clustering approach for noun-noun bigram MWEs, though it can be extended to any types of MWEs. In parallel, the well known statistical models, namely Point-wise Mutual Information (PMI), Log Likelihood Ratio (LLR), Significance function are also employed to extract MWEs from the Bengali corpus. The comparative evaluation shows that the semantic clustering approach outperforms all other competing statistical models. As a by-product of this experiment, we have started developing a standard lexicon in Bengali that serves as a productive Bengali linguistic thesaurus.<br />Comment: 25 pages, 3 figures, 5 tables, International Journal of Linguistics and Language Resources (Lingvistic{\ae} Investigationes), 2014

Details

Database :: OAIster
Publication Type :: Electronic Resource
Accession number :: edsoai.on1106196667
Document Type :: Electronic Resource

Tools

Email
Cite

Printer

Authors Abstract Subjects Details

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Identifying Bengali Multiword Expressions using Semantic Clustering

Abstract

Details

Tools

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Identifying Bengali Multiword Expressions using Semantic Clustering

Abstract

Details

Tools

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources