Unsupervised Word Categorization Using Self-Organizing Maps and Automatically Extracted Morphs.
- Source :
- Intelligent Data Engineering & Automated Learning - IDEAL 2006; 2006, p912-919, 8p
- Publication Year :
- 2006
Abstract
- Automatic creation of syntactic and semantic word categorizations is a challenging problem for highly inflecting languages due to excessive data sparsity. Moreover, the study of colloquial language resources requires the utilization of fully corpus-based tools. We present a completely automated approach for producing word categorizations for morphologically rich languages. The Self-Organizing Map (SOM) is used to cluster words based on the morphological properties of their context words. These properties are extracted using an automated morphological segmentation algorithm called Morfessor. Our experiments on a colloquial Finnish corpus of stories told by young children show that using unsupervised morphs as features leads to clearly improved clusterings compared with using whole context words as features. [ABSTRACT FROM AUTHOR]
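- Illustrative sketch (not from the paper): the idea in the abstract can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: the `SEGMENTATIONS` table is a hand-written stand-in for Morfessor output, the toy token list is invented, and the tiny SOM (grid size, learning rate, Manhattan neighbourhood) is a generic textbook variant, not the configuration the authors used. The sketch only shows why morph features reduce sparsity: two context words that never co-occur as whole words can still share a suffix morph.

```python
import random
from collections import Counter

# Hypothetical segmentations standing in for Morfessor output (assumption).
SEGMENTATIONS = {
    "talossa": ["talo", "ssa"],
    "autossa": ["auto", "ssa"],
}


def segment(word):
    """Return the morphs of a word, or the word itself if unsegmented."""
    return SEGMENTATIONS.get(word, [word])


def context_features(tokens, target, segmenter, window=1):
    """Count the morphs of words within `window` positions of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts.update(segmenter(tokens[j]))
    return counts


def to_vector(counts, vocab):
    """Turn feature counts into a relative-frequency vector over `vocab`."""
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in vocab]


def best_matching_unit(grid, vec):
    """Return the grid coordinate whose weights are closest to `vec`."""
    return min(grid, key=lambda u: sum((a - b) ** 2 for a, b in zip(grid[u], vec)))


def train_som(vectors, rows=2, cols=2, epochs=50, seed=0):
    """Train a very small SOM with a shrinking learning rate and radius."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1 - epoch / epochs
        lr, radius = 0.5 * frac, max(rows, cols) * frac / 2
        for vec in vectors:
            br, bc = best_matching_unit(grid, vec)
            for (r, c), weights in grid.items():
                # Update the winner and every unit inside the current radius.
                if abs(r - br) + abs(c - bc) <= radius:
                    for k in range(dim):
                        weights[k] += lr * (vec[k] - weights[k])
    return grid


# Toy corpus (invented): two verbs followed by words sharing the morph "ssa".
tokens = ["koira", "juoksi", "talossa", "kissa", "nukkui", "autossa"]
targets = ["juoksi", "nukkui"]
features = {t: context_features(tokens, t, segment) for t in targets}
vocab = sorted(set().union(*features.values()))
som = train_som([to_vector(features[t], vocab) for t in targets])
placement = {t: best_matching_unit(som, to_vector(features[t], vocab)) for t in targets}
```

With whole context words as features, the two target words in this toy corpus have no feature in common; with morph features they share "ssa", which is the kind of overlap a clustering method can exploit.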
Details
- Language :
- English
- ISBNs :
- 9783540454854
- Database :
- Complementary Index
- Journal :
- Intelligent Data Engineering & Automated Learning - IDEAL 2006
- Publication Type :
- Book
- Accession number :
- 32914238
- Full Text :
- https://doi.org/10.1007/11875581_109