Accounting for misclassification in electronic health records-derived exposures using generalized linear finite mixture models.

Authors :
Hubbard, Rebecca
Johnson, Eric
Chubak, Jessica
Wernli, Karen
Kamineni, Aruna
Bogart, Andy
Rutter, Carolyn
Source :
Health Services & Outcomes Research Methodology; Jun 2017, Vol. 17 Issue 2, p101-112, 12p
Publication Year :
2017

Abstract

Exposures derived from electronic health records (EHR) may be misclassified, leading to biased estimates of their association with outcomes of interest. An example of this problem arises in the context of cancer screening where test indication, the purpose for which a test was performed, is often unavailable. This poses a challenge to understanding the effectiveness of screening tests because estimates of screening test effectiveness are biased if some diagnostic tests are misclassified as screening. Prediction models have been developed for a variety of exposure variables that can be derived from EHR, but no previous research has investigated appropriate methods for obtaining unbiased association estimates using these predicted probabilities. The full likelihood incorporating information on both the predicted probability of exposure-class membership and the association between the exposure and outcome of interest can be expressed using a finite mixture model. When the regression model of interest is a generalized linear model (GLM), the expectation-maximization algorithm can be used to estimate the parameters using standard software for GLMs. Using simulation studies, we compared the bias and efficiency of this mixture model approach to alternative approaches including multiple imputation and dichotomization of the predicted probabilities to create a proxy for the missing predictor. The mixture model was the only approach that was unbiased across all scenarios investigated. Finally, we explored the performance of these alternatives in a study of colorectal cancer screening with colonoscopy. These findings have broad applicability in studies using EHR data where gold-standard exposures are unavailable and prediction models have been developed for estimating proxies. [ABSTRACT FROM AUTHOR]
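The EM procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several simplifying assumptions that the abstract does not state: a single binary exposure, a logistic outcome model with no other covariates, and predicted probabilities that are well calibrated. The function names (`weighted_logistic`, `em_mixture_glm`, `simulate`) are hypothetical, and the small hand-written Newton-Raphson solver only stands in for the weighted fit that standard GLM software would provide. Each subject is entered twice, once as unexposed and once as exposed. The E-step weights the two copies by the posterior probability of each exposure class, and the M-step refits the weighted regression.

```python
import math
import random


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def weighted_logistic(xs, ys, ws, b0=0.0, b1=0.0, iters=25):
    """Weighted logistic regression of y on (1, x) by Newton-Raphson.

    Stand-in for a weighted fit from standard GLM software.
    """
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, w in zip(xs, ys, ws):
            mu = expit(b0 + b1 * x)
            r = w * (y - mu)          # weighted score contribution
            v = w * mu * (1.0 - mu)   # weighted information contribution
            g0 += r
            g1 += r * x
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the 2x2 information matrix H
        b0, b1 = (b0 + (h11 * g0 - h01 * g1) / det,
                  b1 + (h00 * g1 - h01 * g0) / det)
    return b0, b1


def em_mixture_glm(ys, ps, max_iter=200, tol=1e-5):
    """EM for logit P(Y=1 | X) = b0 + b1*X with X unobserved.

    ps[i] is the predicted probability that subject i is exposed (X=1),
    assumed here to be calibrated.
    """
    n = len(ys)
    # Augmented data: every subject appears with x=0 and again with x=1.
    xs_aug = [0.0] * n + [1.0] * n
    ys_aug = list(ys) * 2
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # E-step: posterior probability of exposure given y and current fit.
        mu0, mu1 = expit(b0), expit(b0 + b1)
        w1 = []
        for y, p in zip(ys, ps):
            f0 = mu0 if y == 1 else 1.0 - mu0
            f1 = mu1 if y == 1 else 1.0 - mu1
            w1.append(p * f1 / (p * f1 + (1.0 - p) * f0))
        ws_aug = [1.0 - w for w in w1] + w1
        # M-step: weighted GLM fit, warm-started at the current estimate.
        new_b0, new_b1 = weighted_logistic(xs_aug, ys_aug, ws_aug,
                                           b0, b1, iters=2)
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


def simulate(n, b0, b1, seed):
    """Simulated data: true exposure is drawn from p and then discarded."""
    rng = random.Random(seed)
    ys, ps = [], []
    for _ in range(n):
        p = rng.betavariate(0.5, 0.5)          # hypothetical predicted probability
        x = 1 if rng.random() < p else 0       # latent true exposure
        y = 1 if rng.random() < expit(b0 + b1 * x) else 0
        ys.append(y)
        ps.append(p)
    return ys, ps


if __name__ == "__main__":
    ys, ps = simulate(3000, b0=-1.0, b1=1.0, seed=1)
    print("EM estimates (b0, b1):", em_mixture_glm(ys, ps))
```

In this sketch the true exposure never enters the fit. Only the outcome and the predicted probability are used, which mirrors the setting in which a gold-standard exposure is unavailable. Adding covariates to the outcome model would require a more general weighted solver than the two-parameter one assumed here.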

Details

Language :
English
ISSN :
1387-3741
Volume :
17
Issue :
2
Database :
Complementary Index
Journal :
Health Services & Outcomes Research Methodology
Publication Type :
Academic Journal
Accession number :
122685481
Full Text :
https://doi.org/10.1007/s10742-016-0149-5