Methods for analyzing data from probabilistic linkage strategies based on partially identifying variables

Authors :
Michel H. Hof
Aeilko H. Zwinderman
Epidemiology and Data Science
Amsterdam Public Health
Source :
Statistics in medicine, 31(30), 4231-4242. John Wiley and Sons Ltd
Publication Year :
2012
Publisher :
Wiley, 2012.

Abstract

In record linkage studies, unique identifiers are often not available, and therefore, the linkage procedure depends on combinations of partially identifying variables with low discriminating power. As a consequence, wrongly linked covariate and outcome pairs will be created and bias further analysis of the linked data. In this article, we investigated two estimators that correct for linkage error in regression analysis. We extended the estimators developed by Lahiri and Larsen and also suggested a weighted least squares approach to deal with linkage error. We considered both linear and logistic regression problems and evaluated the performance of both methods with simulations. Our results show that all wrong covariate and outcome pairs need to be removed from the analysis in order to calculate unbiased regression coefficients in both approaches. This removal requires strong assumptions on the structure of the data. In addition, the bias significantly increases when the assumptions do not hold and wrongly linked records influence the coefficient estimation. Our simulations showed that both methods had similar performance in linear regression problems. With logistic regression problems, the weighted least squares method showed less bias. Because the specific structure of the data in record linkage problems often leads to different assumptions, it is necessary that the analyst has prior knowledge on the nature of the data. These assumptions are more easily introduced in the weighted least squares approach than in the Lahiri and Larsen estimator. Copyright (c) 2012 John Wiley & Sons, Ltd
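
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not reproduce their extended estimators or their weighted least squares approach. It assumes a deliberately simple linkage-error model (each record keeps its true outcome with probability `p_correct` and otherwise receives the outcome of a uniformly chosen other record) and applies the basic Lahiri and Larsen form, which regresses the linked outcome on `Q @ X`, where `Q` holds the assumed linkage probabilities. All function names, parameter values, and the error model are illustrative assumptions.

```python
import numpy as np


def simulate_linked_data(n=2000, p_correct=0.7, beta=(1.0, 2.0), seed=0):
    """Simulate a linear model whose outcomes are partly mislinked.

    Assumed (hypothetical) error model: a record keeps its own outcome with
    probability p_correct, otherwise it gets the outcome of another record
    chosen uniformly at random.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])          # intercept + one covariate
    y = X @ np.asarray(beta) + rng.normal(scale=0.5, size=n)

    # Build the link vector: link[i] is the record whose outcome is attached to i.
    link = np.arange(n)
    wrong = rng.random(n) > p_correct
    offsets = rng.integers(1, n, size=n)          # 1..n-1, so a wrong link never hits itself
    link[wrong] = (link[wrong] + offsets[wrong]) % n
    z = y[link]                                   # observed (linked) outcome

    # Matrix of linkage probabilities q_ij implied by the assumed error model.
    Q = np.full((n, n), (1.0 - p_correct) / (n - 1))
    np.fill_diagonal(Q, p_correct)
    return X, z, Q


def naive_ols(X, z):
    """Ordinary least squares that ignores linkage error."""
    return np.linalg.lstsq(X, z, rcond=None)[0]


def lahiri_larsen(X, z, Q):
    """Basic Lahiri-Larsen-type correction: regress z on W = Q X."""
    W = Q @ X
    return np.linalg.lstsq(W, z, rcond=None)[0]


X, z, Q = simulate_linked_data()
print("naive OLS coefficients:     ", naive_ols(X, z))
print("corrected (Q X) coefficients:", lahiri_larsen(X, z, Q))
```

Under this assumed model the naive slope is expected to be pulled toward zero, while the corrected slope should lie closer to the true value of 2.0. The correction depends on `Q` being specified correctly, which echoes the abstract's point that the analyst needs prior knowledge of the structure of the data.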

Details

ISSN :
02776715
Volume :
31
Database :
OpenAIRE
Journal :
Statistics in Medicine
Accession number :
edsair.doi.dedup.....04d5967a421fefa005d9ca515e225499