Word Embedding for French Natural Language in Healthcare: A Comparative Study
- Authors
Dynomant, Emeric; Lelong, Romain; Dahamna, Badisse; Massonnaud, Clément; Kerdelhué, Gaëtan; Grosjean, Julien; Canu, Stéphane; Darmoni, Stéfan
- Subjects
NATURAL language processing; DEEP learning; EMBEDDED computer systems; DATA mining; SEMANTICS
- Abstract
Structuring raw medical documents with ontology mapping is the next step for medical intelligence. Deep learning models take mathematically embedded information, such as encoded texts, as input. To that end, word embedding methods represent every word of a text as a fixed-length vector. A formal evaluation of three word embedding methods was performed on raw medical documents. The data correspond to more than 12M diverse documents produced at the Rouen hospital (drug prescriptions, discharge and surgery summaries, inter-service letters, etc.). Automatic and manual validation demonstrates that Word2Vec with the skip-gram architecture obtained the best score on three of the four accuracy tests. This model will now be used as the first layer of an AI-based semantic annotator. [ABSTRACT FROM AUTHOR]
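For readers unfamiliar with the approach described in the abstract, the following is a minimal sketch of training a skip-gram Word2Vec model on tokenized clinical sentences. It assumes the gensim library; the corpus, hyperparameters, and query terms are illustrative only and are not the authors' actual setup or data.

```python
from gensim.models import Word2Vec

# Each document is pre-tokenized into a list of lowercase word tokens.
corpus = [
    ["prescription", "de", "paracetamol", "500", "mg"],
    ["compte", "rendu", "de", "sortie", "du", "patient"],
    # ... in the study, more than 12M hospital documents would be streamed here
]

model = Word2Vec(
    sentences=corpus,
    vector_size=100,   # fixed length of each word vector
    window=5,          # context window around the target word
    sg=1,              # 1 = skip-gram architecture (0 = CBOW)
    min_count=1,       # keep rare tokens in this toy example
    workers=4,
)

# Every vocabulary word is now represented as a fixed-length vector,
# which can feed a downstream model such as a semantic annotator.
vector = model.wv["patient"]                    # numpy array of shape (100,)
print(model.wv.most_similar("patient", topn=5))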
- Published
2019