The Challenge of Pairing Big Datasets: Probabilistic Record Linkage Methods and Diagnosis of Their Empirical Viability

Authors :
Lucas Ferreira Mation
Yaohao Peng
Source :
Journal of Business Cycle Research. 16:35-57
Publication Year :
2020
Publisher :
Springer Science and Business Media LLC, 2020.

Abstract

In this paper, we evaluated the predictive performance of probabilistic record linkage algorithms and discussed how different configurations of blocking keys, string similarity functions, and phonetic codes affect overall predictive performance and computational complexity. We also carried out a bibliographical survey of the main deterministic and probabilistic record linkage methods, of recent advances that combine them with machine learning techniques, and of the main packages and implementations available in the open-source R language. The results can provide heuristics for integrating administrative records at the national level and have potential value for the formulation and evaluation of public policies.

Details

ISSN :
2509-7970 and 2509-7962
Volume :
16
Database :
OpenAIRE
Journal :
Journal of Business Cycle Research
Accession number :
edsair.doi...........0b957ef25a40be46dcf9720c13eb870a
Full Text :
https://doi.org/10.1007/s41549-020-00043-1