1. Which are the best identifiers for record linkage?
- Author
Cyril Ferdynus, Béatrice Gouyon-Cornet, Karima Bourquard, Christine Binquet, R. Pattisina, François-André Allaert, Jean-Bernard Gouyon, and Catherine Quantin
- Subjects
Patient Identification Systems, Computer science, media_common.quotation_subject, Health Informatics, Christian name, computer.software_genre, Efficiency, Organizational, Soundex, Health Information Management, Humans, Quality (business), Date of birth, General Nursing, media_common, Linkage (software), business.industry, Identifier, Data mining, Artificial intelligence, France, Medical Record Linkage, business, computer, Record linkage, Natural language processing, Confidentiality
- Abstract
Because linkage using less informative identifiers can lead to linkage errors, it is essential to quantify the information associated with each identifier. The aim of this study was to estimate the discriminating power of different identifiers likely to be used in a record linkage process. This work showed the value of three identifiers when linking data concerning the same patient using an automatic procedure based on the method proposed by Jaro: the date of birth, the first name, and the last name appeared to be the most appropriate identifiers. Including a poorly discriminating identifier such as gender did not improve the results. Moreover, adding a second Christian name, which is often missing, increased linkage errors. In contrast, it seemed that using a phonetic treatment adapted to the French language could improve linkage results compared with Soundex. However, whatever the method used, it seems necessary to improve the quality of identifier collection, as it could greatly influence linkage results.
- Published
- 2005
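
The abstract compares a French-adapted phonetic treatment with Soundex but does not describe either algorithm. As an illustration only, here is a minimal sketch of standard American Soundex, not the authors' implementation. The function name `soundex` and the example surnames are assumptions chosen for this sketch. Accented letters are simply dropped here, which is itself a simplification.

```python
# Minimal sketch of standard American Soundex (illustrative only; this is
# not the phonetic treatment evaluated in the study).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code of a name ('' if no A-Z letters)."""
    # Keep unaccented A-Z only; accented letters are dropped (simplification).
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]  # pad with zeros, truncate to 4 characters


# Hypothetical surnames (not taken from the study's data).
for surname in ("Robert", "Rupert", "Renault", "Renaud"):
    print(surname, soundex(surname))
```

In this sketch, "Robert" and "Rupert" share the code R163, while "Renault" and "Renaud", which are pronounced alike in French, receive different codes (R543 and R530). Mismatches of this kind are one plausible reason a language-specific phonetic treatment could link French names more reliably.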