Medical record linkage in health information systems by approximate string matching and clustering.
- Source :
- BMC medical informatics and decision making [BMC Med Inform Decis Mak] 2005 Oct 11; Vol. 5, pp. 32. Date of Electronic Publication: 2005 Oct 11.
- Publication Year :
- 2005
Abstract
- Background: Multiplication of data sources within heterogeneous healthcare information systems always results in redundant information, split among multiple databases. Our objective is to detect exact and approximate duplicates within identity records, in order to attain a better quality of information and to permit cross-linkage among stand-alone and clustered databases. Furthermore, we need to assist human decision making, by computing a value reflecting identity proximity.
Methods: The proposed method is in three steps. The first step is to standardise and to index elementary identity fields, using blocking variables, in order to speed up information analysis. The second is to match similar pair records, relying on a global similarity value taken from the Porter-Jaro-Winkler algorithm. And the third is to create clusters of coherent related records, using graph drawing, agglomerative clustering methods and partitioning methods.
Results: The batch analysis of 300,000 "supposedly" distinct identities isolates 240,000 true unique records, 24,000 duplicates (clusters composed of 2 records) and 3,000 clusters whose size is greater than or equal to 3 records.
Conclusion: Duplicate-free databases, used in conjunction with relevant indexes and similarity values, allow immediate (i.e. real-time) proximity detection when inserting a new identity.
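The three steps named under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the "Porter-Jaro-Winkler" variant, so the standard Jaro-Winkler similarity is used instead, and connected components (via union-find) stand in for the graph, agglomerative and partitioning clustering the abstract mentions. The field names (`last`, `first`, `birth_year`), the blocking key, the averaged score and the 0.9 threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as
    # half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix chars."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def standardise(record: dict) -> dict:
    """Step 1a: normalise every field to a trimmed, upper-case string."""
    return {key: str(value).strip().upper() for key, value in record.items()}


def cluster_records(records: list, threshold: float = 0.9) -> list:
    """Return clusters of record indices judged to share one identity."""
    recs = [standardise(r) for r in records]

    # Step 1b: blocking. Only records sharing this (hypothetical) key
    # are compared, which avoids scoring every possible pair.
    blocks = defaultdict(list)
    for idx, rec in enumerate(recs):
        blocks[(rec["last"][:1], rec["birth_year"])].append(idx)

    parent = list(range(len(recs)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Step 2: pairwise matching inside each block with a global score
    # (here, an assumed plain average of the two name similarities).
    for members in blocks.values():
        for a, b in combinations(members, 2):
            score = (jaro_winkler(recs[a]["last"], recs[b]["last"])
                     + jaro_winkler(recs[a]["first"], recs[b]["first"])) / 2
            if score >= threshold:
                parent[find(a)] = find(b)

    # Step 3: clusters are the connected components of the match graph.
    clusters = defaultdict(list)
    for idx in range(len(recs)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())
```

Under these assumptions, `jaro_winkler("MARTHA", "MARHTA")` evaluates to about 0.961, and two records that differ only by such a transposition in the last name would land in the same cluster, while a record in a different block is never compared with them.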
- Subjects :
- Algorithms
Electronic Data Processing
France
Humans
Information Storage and Retrieval
Medical Informatics Computing
Names
Time Factors
Cluster Analysis
Databases as Topic
Hospital Information Systems
Medical Record Linkage methods
Medical Records Systems, Computerized
Patient Identification Systems classification
Details
- Language :
- English
- ISSN :
- 1472-6947
- Volume :
- 5
- Database :
- MEDLINE
- Journal :
- BMC medical informatics and decision making
- Publication Type :
- Academic Journal
- Accession number :
- 16219102
- Full Text :
- https://doi.org/10.1186/1472-6947-5-32