An Empirical Test of GRUs and Deep Contextualized Word Representations on De-Identification.

Authors :
Lee, Kahyun
Filannino, Michele
Uzuner, Özlem
Source :
Studies in Health Technology & Informatics; 2019, Vol. 264, p218-222, 5p, 8 Charts
Publication Year :
2019

Abstract

De-identification aims to remove 18 categories of protected health information from electronic health records. Ideally, de-identification systems should be reliable and generalizable. Previous research has focused on improving performance but has not examined generalizability. This paper investigates both performance and generalizability. To improve current state-of-the-art performance based on long short-term memory (LSTM) units, we introduce a system that uses gated recurrent units (GRUs) and deep contextualized word representations, neither of which has previously been applied to de-identification. We measure the performance and generalizability of each system using the 2014 i2b2/UTHealth and 2016 CEGS N-GRID de-identification datasets. We show that deep contextualized word representations improve state-of-the-art performance, while the benefit of replacing LSTM units with GRUs is not significant. The generalizability of de-identification systems improved significantly with deep contextualized word representations; in addition, the LSTM-based system is more generalizable than the GRU-based system. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
09269630
Volume :
264
Database :
Complementary Index
Journal :
Studies in Health Technology & Informatics
Publication Type :
Academic Journal
Accession number :
138945716
Full Text :
https://doi.org/10.3233/SHTI190215