
Unsupervised Word Categorization Using Self-Organizing Maps and Automatically Extracted Morphs.

Authors :
Corchado, Emilio
Yin, Hujun
Botti, Vicente
Fyfe, Colin
Klami, Mikaela
Lagus, Krista
Source :
Intelligent Data Engineering & Automated Learning - IDEAL 2006; 2006, p912-919, 8p
Publication Year :
2006

Abstract

Automatic creation of syntactic and semantic word categorizations is a challenging problem for highly inflecting languages due to excessive data sparsity. Moreover, the study of colloquial language resources requires the utilization of fully corpus-based tools. We present a completely automated approach for producing word categorizations for morphologically rich languages. A Self-Organizing Map (SOM) is utilized for clustering words based on the morphological properties of the context words. These properties are extracted using an automated morphological segmentation algorithm called Morfessor. Our experiments on a colloquial Finnish corpus of stories told by young children show that utilizing unsupervised morphs as features leads to clearly improved clusterings when compared to the use of whole context words as features. [ABSTRACT FROM AUTHOR]
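
The general idea in the abstract (describe each word by the morphs of its neighbouring words, then place the words on a Self-Organizing Map) can be illustrated with a minimal sketch. This is not the authors' implementation. The segmentation table `TOY_MORPHS` is a hypothetical hand-written stand-in for Morfessor output, the eight two-word sentences are invented toy data, and the map size, learning rate, and neighbourhood schedule are arbitrary assumptions chosen only to keep the example small.

```python
import numpy as np

# Hypothetical hand-written segmentations standing in for Morfessor output.
TOY_MORPHS = {
    "kissa": ["kissa"], "kissat": ["kissa", "t"],
    "koira": ["koira"], "koirat": ["koira", "t"],
    "juoksee": ["juokse", "e"], "juoksevat": ["juokse", "vat"],
    "nukkuu": ["nukku", "u"], "nukkuvat": ["nukku", "vat"],
}

def segment(word):
    """Return the morphs of a word; unknown words stay whole."""
    return TOY_MORPHS.get(word, [word])

def context_vectors(sentences, use_morphs=True):
    """Describe each word by the morphs (or whole words) of its neighbours."""
    counts = {}  # word -> {feature: count}
    for sent in sentences:
        for i, word in enumerate(sent):
            feats = counts.setdefault(word, {})
            for side, j in (("L", i - 1), ("R", i + 1)):
                if 0 <= j < len(sent):
                    units = segment(sent[j]) if use_morphs else [sent[j]]
                    for u in units:
                        key = side + ":" + u
                        feats[key] = feats.get(key, 0) + 1
    words = sorted(counts)
    features = sorted({f for fc in counts.values() for f in fc})
    index = {f: k for k, f in enumerate(features)}
    X = np.zeros((len(words), len(features)))
    for r, w in enumerate(words):
        for f, c in counts[w].items():
            X[r, index[f]] = c
    # Normalize rows to unit length; all-zero rows are left as zeros.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms == 0, 1.0, norms)
    return words, features, X

def train_som(X, rows=3, cols=3, epochs=50, lr0=0.5, seed=0):
    """Online SOM training with a Gaussian neighbourhood (assumed schedule)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, X.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = max(rows, cols) / 2.0
    steps = epochs * len(X)
    t = 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            frac = t / steps
            lr = lr0 * (1.0 - frac)                      # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # shrinking radius
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            t += 1
    return weights

def bmu_of(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

# Invented toy corpus: singular and plural nouns followed by agreeing verbs.
sentences = [
    ["kissa", "juoksee"], ["koira", "juoksee"],
    ["kissa", "nukkuu"], ["koira", "nukkuu"],
    ["kissat", "juoksevat"], ["koirat", "juoksevat"],
    ["kissat", "nukkuvat"], ["koirat", "nukkuvat"],
]
words, features, X = context_vectors(sentences, use_morphs=True)
som = train_som(X)
placement = {w: bmu_of(som, x) for w, x in zip(words, X)}

# Compare how similar "kissa" and "kissat" look under each feature type.
_, _, X_whole = context_vectors(sentences, use_morphs=False)
sim_morph = float(X[words.index("kissa")] @ X[words.index("kissat")])
sim_whole = float(X_whole[words.index("kissa")] @ X_whole[words.index("kissat")])
```

In this toy data the singular and plural forms of a noun share no whole-word contexts, so `sim_whole` is zero, while the shared verb stems make `sim_morph` positive. That is only a small illustration of how morph features can reduce sparsity; it says nothing about the results reported in the paper.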

Details

Language :
English
ISBNs :
9783540454854
Database :
Complementary Index
Journal :
Intelligent Data Engineering & Automated Learning - IDEAL 2006
Publication Type :
Book
Accession number :
32914238
Full Text :
https://doi.org/10.1007/11875581_109