PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records
- Source :
- Journal of the American Medical Informatics Association : JAMIA
- Publication Year :
- 2020
- Publisher :
- Oxford University Press (OUP), 2020.
Abstract
- Objective: Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs.
Materials and Methods: PheMap is a knowledge base of medical concepts with quantified relationships to phenotypes that have been extracted by natural language processing from publicly available resources. PheMap searches EHRs for each phenotype's quantified concepts and uses them to calculate an individual's probability of having this phenotype. We compared PheMap to clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network for type 2 diabetes mellitus (T2DM), dementia, and hypothyroidism using 84 821 individuals from Vanderbilt University Medical Center's BioVU DNA Biobank. We implemented PheMap-based phenotypes for genome-wide association studies (GWAS) for T2DM, dementia, and hypothyroidism, and phenome-wide association studies (PheWAS) for variants in FTO, HLA-DRB1, and TCF7L2.
Results: In this initial iteration, the PheMap knowledge base contains quantified concepts for 841 disease phenotypes. For T2DM, dementia, and hypothyroidism, the accuracy of the PheMap phenotypes was >97% using a 50% threshold and eMERGE case-control status as a reference standard. In the GWAS analyses, PheMap-derived phenotype probabilities replicated 43 of 51 previously reported disease-associated variants for the 3 phenotypes. For 9 of the 11 top associations, PheMap provided an equivalent or more significant P value than eMERGE-based phenotypes. The PheMap-based PheWAS performed comparably to or better than a traditional phecode-based PheWAS. PheMap is publicly available online.
Conclusions: PheMap significantly streamlines the process of extracting research-quality phenotype information from EHRs, with performance comparable to or better than that of current phenotyping approaches.
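The abstract describes two steps without giving formulas: scoring an individual from the quantified concepts found in their record, and checking the resulting probabilities against a reference standard at a 50% threshold. The sketch below illustrates one plausible reading only. The normalized weighted-sum score, the function names, and the concept identifiers are all assumptions made for illustration and are not taken from the paper or from the PheMap knowledge base.

```python
from typing import Dict, List, Set


def phenotype_probability(concept_weights: Dict[str, float],
                          patient_concepts: Set[str]) -> float:
    """Hypothetical score: share of a phenotype's total concept weight
    that is present in one patient's record (assumed formula)."""
    total = sum(concept_weights.values())
    if total <= 0:
        return 0.0
    matched = sum(w for c, w in concept_weights.items() if c in patient_concepts)
    return matched / total


def accuracy_at_threshold(probabilities: List[float],
                          reference: List[bool],
                          threshold: float = 0.5) -> float:
    """Fraction of individuals whose thresholded probability agrees
    with the reference case/control label."""
    if len(probabilities) != len(reference):
        raise ValueError("probabilities and reference must have equal length")
    predicted = [p >= threshold for p in probabilities]
    correct = sum(pred == ref for pred, ref in zip(predicted, reference))
    return correct / len(reference)


# Made-up concept identifiers and weights for a single phenotype.
weights = {"concept_a": 2.0, "concept_b": 1.0, "concept_c": 1.0}
patient = {"concept_a", "concept_c", "unrelated_concept"}
score = phenotype_probability(weights, patient)

# Made-up probabilities and reference labels for three individuals.
acc = accuracy_at_threshold([score, 0.1, 0.6], [True, False, False])
```

In this toy setup the threshold plays the same role as the 50% cutoff mentioned in the Results: any individual whose score is at or above it is treated as a case when comparing against the reference labels.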
- Subjects :
- Adult
0301 basic medicine
endocrine system diseases
AcademicSubjects/SCI01060
Computer science
Knowledge Bases
Information Storage and Retrieval
Health Informatics
Genomics
Genome-wide association study
Computational biology
Research and Applications
Polymorphism, Single Nucleotide
03 medical and health sciences
0302 clinical medicine
Hypothyroidism
Terminology as Topic
medicine
Humans
Dementia
030212 general & internal medicine
natural language processing
Throughput (business)
AcademicSubjects/MED00580
Genetic association
business.industry
Medical record
medicine.disease
Biobank
Phenotype
electronic health records
030104 developmental biology
Diabetes Mellitus, Type 2
Knowledge base
high-throughput phenotyping
AcademicSubjects/SCI01530
business
Algorithms
Genome-Wide Association Study
Details
- ISSN :
- 1527974X
- Volume :
- 27
- Database :
- OpenAIRE
- Journal :
- Journal of the American Medical Informatics Association
- Accession number :
- edsair.doi.dedup.....27e22cd18e10a81c212ea3d842c909f3
- Full Text :
- https://doi.org/10.1093/jamia/ocaa104