An easily implemented method for abbreviation expansion for the medical domain in Japanese text. A preliminary study.
- Source :
- Methods of Information in Medicine; 2013, Vol. 52 Issue 1, p51-61, 11p
- Publication Year :
- 2013
Abstract
- Background: One barrier to the effective use of computerized health-care-related text is the ambiguity of abbreviations. To date, abbreviation disambiguation has been treated as a classification task based on surrounding words. Applying this framework to languages that have no word boundaries requires pre-processing to segment a sentence into separate word sequences. While this segmentation is often a source of problems, it is unknown whether word information is actually required for abbreviation expansion.
- Objectives: As a preliminary study, the present study examined and compared abbreviation expansion methods with and without the incorporation of word information.
- Methods: We implemented two abbreviation expansion methods: 1) a morpheme-based method that relied on word information and therefore required pre-processing, and 2) a character-based method that relied on simple character information. We compared the expansion accuracies of these two methods using eight medical abbreviations. Experimental data were automatically built as a pseudo-annotated corpus using the Internet.
- Results: Accuracies for the character-based method ranged from 0.890 to 0.942, while accuracies for the morpheme-based method ranged from 0.796 to 0.932. The character-based method significantly outperformed the morpheme-based method for three of the eight abbreviations (p < 0.05). For the remaining five abbreviations, no significant differences were found between the two methods.
- Conclusions: Character information may be a good, simpler alternative to morphological information for expanding English medical abbreviations that appear in Japanese texts on the Internet. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 00261270
- Volume :
- 52
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- Methods of Information in Medicine
- Publication Type :
- Academic Journal
- Accession number :
- 107879146
- Full Text :
- https://doi.org/10.3414/ME12-01-0040