1. A Unified Machine Reading Comprehension Framework for Cohort Selection
- Author
Qingcai Chen, Buzhou Tang, Ying Xiong, Zhengxing Huang, and Weihua Peng
- Subjects
Computer science, Health Informatics, computer.software_genre, Cohort Studies, Health Information Management, Electronic Health Records, Humans, Electrical and Electronic Engineering, Selection (genetic algorithm), Natural Language Processing, Mathematical logic, business.industry, Mechanism (biology), Patient Selection, Computer Science Applications, Comprehension, Cohort, Benchmark (computing), Artificial intelligence, business, Machine reading, computer, Algorithms, Natural language processing, Meaning (linguistics)
- Abstract
Cohort selection, which determines whether an individual satisfies given selection criteria, is an essential prerequisite for clinical research. Previous work on cohort selection usually treated each selection criterion independently, ignoring both the meaning of each criterion and the relations among criteria. To address these problems, we propose a novel unified machine reading comprehension (MRC) framework. In this framework, we design simple rules to generate a question for each criterion from the cohort selection guidelines, and we treat clues extracted from patients' medical records by trigger words as passages. A series of state-of-the-art MRC models based on BiDAF, BiMPM, BERT, BioBERT, NCBI-BERT, and RoBERTa are deployed to determine which question and passage pairs match. We also introduce a cross-criterion attention mechanism over the representations of question and passage pairs to model relations among cohort selection criteria. Results on two datasets, the dataset of the 2018 National NLP Clinical Challenge (N2C2) for cohort selection and a dataset derived from MIMIC-III, show that our NCBI-BERT MRC model with the cross-criterion attention mechanism achieves the highest micro-averaged F1-score: 0.9070 on the N2C2 dataset and 0.8353 on the MIMIC-III dataset. On the N2C2 dataset, it is competitive with the best system, which relies on a large number of rules defined by medical experts. Comparing these two models, we find that the NCBI-BERT MRC model performs worse mainly on mathematical logic criteria. When rules are used instead of the NCBI-BERT MRC model for some mathematical logic criteria on the N2C2 dataset, we obtain a new benchmark F1-score of 0.9163, indicating that rules are easy to integrate into MRC models for improvement.
- Published
- 2022
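
The abstract describes turning each selection criterion into a question and using trigger words to pull clue sentences from a medical record as the passage. The sketch below illustrates that pairing step only. The criterion names, question template, trigger lists, and sample record are all hypothetical, since the abstract does not give the paper's actual rules, and the naive sentence splitter is an assumption made for brevity. Scoring the pairs with an MRC model is left out.

```python
import re

# Hypothetical criteria for illustration; not the paper's actual rules or triggers.
CRITERIA = {
    "abdominal_surgery": {
        "description": "a history of intra-abdominal surgery",
        "triggers": ["appendectomy", "bowel resection"],
    },
    "alcohol_use": {
        "description": "a history of alcohol abuse",
        "triggers": ["alcohol", "drinks"],
    },
}

# Made-up record text, used only to exercise the functions below.
RECORD = "Patient underwent appendectomy in 2010. Denies alcohol use. No chest pain."


def make_question(description):
    """Generate a question from a criterion description with one simple template."""
    return f"Does the patient have {description}?"


def split_sentences(record):
    """Naive splitter: break after sentence-final punctuation or at newlines."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", record)
    return [p.strip() for p in parts if p.strip()]


def extract_clues(record, triggers):
    """Keep sentences containing any trigger word and join them into a passage."""
    clues = [
        sentence
        for sentence in split_sentences(record)
        if any(trigger in sentence.lower() for trigger in triggers)
    ]
    return " ".join(clues)


def build_pairs(record, criteria):
    """Return {criterion name: (question, passage)} for one patient record."""
    return {
        name: (
            make_question(spec["description"]),
            extract_clues(record, spec["triggers"]),
        )
        for name, spec in criteria.items()
    }


pairs = build_pairs(RECORD, CRITERIA)
for name, (question, passage) in pairs.items():
    print(name, "|", question, "|", passage)
```

Each resulting pair would then be passed to a matching model, which decides whether the passage supports the question. An empty passage means no trigger word fired for that criterion.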