Is artificial data useful for biomedical Natural Language Processing algorithms?

Authors :
Sumithra Velupillai
Zixu Wang
Julia Ive
Lucia Specia
Source :
BioNLP@ACL
Publication Year :
2019
Publisher :
Association for Computational Linguistics, 2019.

Abstract

A major obstacle to the development of Natural Language Processing (NLP) methods in the biomedical domain is data accessibility. This problem can be addressed by generating medical data artificially. Most previous studies have focused on the generation of short clinical text, and evaluation of the data utility has been limited. We propose a generic methodology to guide the generation of clinical text with key phrases. We use the artificial data as additional training data in two key biomedical NLP tasks: text classification and temporal relation extraction. We show that artificially generated training data used in conjunction with real training data can lead to performance boosts for data-greedy neural network algorithms. We also demonstrate the usefulness of the generated data for NLP setups where it fully replaces real training data.

BioNLP 2019

Details

Database :
OpenAIRE
Journal :
Proceedings of the 18th BioNLP Workshop and Shared Task
Accession number :
edsair.doi.dedup.....11ba22ac46bc28157d6e1619d60c939b
Full Text :
https://doi.org/10.18653/v1/w19-5026