An easily implemented method for abbreviation expansion for the medical domain in Japanese text. A preliminary study.

Authors :
Shinohara EY
Aramaki E
Imai T
Miura Y
Tonoike M
Ohkuma T
Masuichi H
Ohe K
Source :
Methods of Information in Medicine; 2013, Vol. 52 Issue 1, p51-61, 11p
Publication Year :
2013

Abstract

Background: One barrier to the effective use of computerized health-care-related text is the ambiguity of abbreviations. To date, abbreviation disambiguation has been treated as a classification task based on surrounding words. Applying this framework to languages that have no word boundaries requires pre-processing to segment each sentence into a sequence of words. Although segmentation is often a source of problems, it is unknown whether word information is actually required for abbreviation expansion.

Objectives: As a preliminary study, the present study examined and compared abbreviation expansion methods with and without the incorporation of word information.

Methods: We implemented two abbreviation expansion methods: 1) a morpheme-based method that relied on word information and therefore required pre-processing, and 2) a character-based method that relied on simple character information. We compared the expansion accuracies of these two methods using eight medical abbreviations. Experimental data were automatically built as a pseudo-annotated corpus using the Internet.

Results: Accuracies for the character-based method ranged from 0.890 to 0.942, while accuracies for the morpheme-based method ranged from 0.796 to 0.932. The character-based method significantly outperformed the morpheme-based method for three of the eight abbreviations (p < 0.05). For the remaining five abbreviations, no significant differences were found between the two methods.

Conclusions: Character information may be a good alternative, in terms of simplicity, to morphological information for expanding English medical abbreviations that appear in Japanese texts on the Internet. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
00261270
Volume :
52
Issue :
1
Database :
Complementary Index
Journal :
Methods of Information in Medicine
Publication Type :
Academic Journal
Accession number :
107879146
Full Text :
https://doi.org/10.3414/ME12-01-0040