Exploiting parallel corpora to scale up multilingual biomedical terminologies.

Authors :
Hellrich J
Hahn U
Source :
Studies in health technology and informatics [Stud Health Technol Inform] 2014; Vol. 205, pp. 575-8.
Publication Year :
2014

Abstract

Creating and maintaining biomedical terminologies for multiple natural languages is a resource-intensive task, typically carried out by human domain experts. We report on efforts to support this process computationally by treating term acquisition as a machine translation-guided classification problem that capitalizes on parallel corpora. We report on experiments for the French, German, Spanish and Dutch parts of a UMLS-derived terminology, for which we generated 18k, 23k, 19k and 12k new terms and synonyms, respectively. Based on expert assessments of a novel German terminology segment, about 80% of the newly acquired terms were judged biomedically reasonable and terminologically valid.

Details

Language :
English
ISSN :
1879-8365
Volume :
205
Database :
MEDLINE
Journal :
Studies in health technology and informatics
Publication Type :
Academic Journal
Accession number :
25160251