The Challenge of Pairing Big Datasets: Probabilistic Record Linkage Methods and Diagnosis of Their Empirical Viability
- Source :
- Journal of Business Cycle Research. 16:35-57
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
Abstract
- In this paper, we evaluated the predictive performance of probabilistic record linkage algorithms, discussing the implications of different configurations of blocking keys, string similarity functions and phonetic codes on overall predictive performance and computational complexity. Furthermore, we carried out a bibliographical survey of the main deterministic and probabilistic record linkage methods, of recent advances that combine them with machine learning techniques, and of the main packages and implementations available in the open-source R language. The results can provide heuristics for problems of administrative records integration at the national level and have potential value for the formulation and evaluation of public policies.
- Subjects :
- Economics and Econometrics
Computational complexity theory
Computer science
Big data
Probabilistic logic
Machine learning
Code (cryptography)
Artificial intelligence
Statistics, Probability and Uncertainty
Business and International Management
String metric
Heuristics
Implementation
Finance
Record linkage
Details
- ISSN :
- 2509-7970 and 2509-7962
- Volume :
- 16
- Database :
- OpenAIRE
- Journal :
- Journal of Business Cycle Research
- Accession number :
- edsair.doi...........0b957ef25a40be46dcf9720c13eb870a
- Full Text :
- https://doi.org/10.1007/s41549-020-00043-1