Author: "Randall, Sean" / Publisher: biomed central - Searchworks@Jio Institute Digital Library Search Results

1. A blinded evaluation of privacy preserving record linkage with Bloom filters

Author: Randall, Sean, Wichmann, Helen, Brown, Adrian, Boyd, James, Eitelhuber, Tom, Merchant, Alexandra, and Ferrante, Anna
Published: 2022
Full Text: View/download PDF

2. Australian general practitioner perceptions to sharing clinical data for secondary use: a mixed method approach

Author: Varhol, Richard J., Randall, Sean, Boyd, James H., and Robinson, Suzanne
Published: 2022
Full Text: View/download PDF

3. Long-term mental health outcomes after unintentional burns sustained during childhood: a retrospective cohort study

Author: Duke, Janine M., Randall, Sean M., Vetrichevvel, Thirthar P., McGarry, Sarah, Boyd, James H., Rea, Suzanne, and Wood, Fiona M.
Published: 2018
Full Text: View/download PDF

4. A population-based comparison study of the mental health of patients with intentional and unintentional burns

Author: Vetrichevvel, Thirthar P, Randall, Sean M, Wood, Fiona M, Rea, Suzanne, Boyd, James H, and Duke, Janine M
Published: 2018
Full Text: View/download PDF

5. Sociodemographic differences in linkage error: an examination of four large-scale datasets

Author: Randall, Sean, Brown, Adrian, Boyd, James, Schnell, Rainer, Borgs, Christian, and Ferrante, Anna
Published: 2018
Full Text: View/download PDF

6. A retrospective cohort study to compare post-injury admissions for infectious diseases in burn patients, non-burn trauma patients and uninjured people

Author: Duke, Janine M., Randall, Sean M., Boyd, James H., Fear, Mark W., Rea, Suzanne, and Wood, Fiona M.
Published: 2018
Full Text: View/download PDF

7. Estimating parameters for probabilistic linkage of privacy-preserved datasets.

Author: Brown, Adrian P., Randall, Sean M., Ferrante, Anna M., Semmens, James B., and Boyd, James H.
Subjects: *MEDICAL records, *PROBABILITY theory, *DATA quality, *ALGORITHMS, *ESTIMATION theory, *MEDICAL ethics, *MEDICAL record linkage, *PRIVACY, *DATA security, *ACQUISITION of data, RESEARCH evaluation
Abstract: Background: Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. Methods: Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20% error. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. Results: Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher than the F-measure using calculated probabilities. Further, the threshold estimation yielded results for F-measure that were only slightly below the highest possible for those probabilities. Conclusions: The method appears highly accurate across a spectrum of datasets with varying degrees of error. As there are few alternatives for parameter estimation, the approach is a major step towards providing a complete operational approach for probabilistic linkage of privacy-preserved datasets. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF
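
Note on item 7: the abstract states that each field was privacy-preserved with its own Bloom filter. The following is a minimal sketch of what field-level Bloom filter encoding and comparison can look like, not the authors' implementation. The function names, the padded-bigram tokenisation, the keyed HMAC-SHA256 hashing, the filter length (1024 bits), the number of hash functions (4) and the Dice comparison are all illustrative assumptions; none of these details appear in the record above.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a field value into padded character bigrams (assumed tokenisation)."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, key: bytes, size: int = 1024,
                 num_hashes: int = 4) -> frozenset[int]:
    """Return the set bit positions of a Bloom filter for a single field.

    size and num_hashes are hypothetical defaults chosen for illustration.
    """
    positions = set()
    for gram in bigrams(value):
        for i in range(num_hashes):
            # One keyed hash per (hash index, bigram) pair.
            digest = hmac.new(key, f"{i}:{gram}".encode("utf-8"),
                              hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return frozenset(positions)


def dice(a: frozenset[int], b: frozenset[int]) -> float:
    """Dice coefficient of two filters, each given as a set of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    secret = b"shared-secret"  # hypothetical key agreed between data holders
    smith = bloom_encode("Smith", secret)
    smyth = bloom_encode("Smyth", secret)
    jones = bloom_encode("Jones", secret)
    print(dice(smith, smyth), dice(smith, jones))
```

Similar values share most of their bigrams, so their filters overlap heavily and score close to 1, while unrelated values overlap only through chance hash collisions. Field-level similarity scores of this kind are what a probabilistic linkage would then weight and threshold.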

8. Evaluating privacy-preserving record linkage using cryptographic long-term keys and multibit trees on large medical datasets.

Author: Brown, Adrian P., Borgs, Christian, Randall, Sean M., and Schnell, Rainer
Subjects: PRIVACY, MEDICAL records, CRYPTOGRAPHIC equipment, MEDICAL databases, MEDICAL research, DATABASES, HOSPITAL care, MEDICAL ethics, MEDICAL record linkage, STANDARDS
Abstract: Background: Integrating medical data using databases from different sources by record linkage is a powerful technique increasingly used in medical research. Under many jurisdictions, unique personal identifiers needed for linking the records are unavailable. Since sensitive attributes, such as names, have to be used instead, privacy regulations usually demand encrypting these identifiers. The corresponding set of techniques for privacy-preserving record linkage (PPRL) has received widespread attention. One recent method is based on Bloom filters. Due to superior resilience against cryptographic attacks, composite Bloom filters (cryptographic long-term keys, CLKs) are considered best practice for privacy in PPRL. Real-world performance of these techniques using large-scale data is unknown up to now. Methods: Using a large subset of Australian hospital admission data, we tested the performance of an innovative PPRL technique (CLKs using multibit trees) against a gold-standard derived from clear-text probabilistic record linkage. Linkage time and linkage quality (recall, precision and F-measure) were evaluated. Results: Clear text probabilistic linkage resulted in marginally higher precision and recall than CLKs. PPRL required more computing time but 5 million records could still be de-duplicated within one day. However, the PPRL approach required fine tuning of parameters. Conclusions: We argue that increased privacy of PPRL comes with the price of small losses in precision and recall and a large increase in computational burden and setup time. These costs seem to be acceptable in most applied settings, but they have to be considered in the decision to apply PPRL. Further research on the optimal automatic choice of parameters is needed. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

9. Accuracy and completeness of patient pathways - the benefits of national data linkage in Australia.

Author: Boyd, James H., Randall, Sean M., Ferrante, Anna M., Bauer, Jacqueline K., McInneny, Kevin, Brown, Adrian P., Spilsbury, Katrina, Gillies, Margo, and Semmens, James B.
Subjects: *ACCURACY, *COMPLETENESS theorem, *HOSPITAL records, *HOSPITAL admission & discharge, *DEATH rate
Abstract: Background: The technical challenges associated with national data linkage, and the extent of cross-border population movements, are explored as part of a pioneering research project. The project involved linking state-based hospital admission records and death registrations across Australia for a national study of hospital related deaths. Methods: The project linked over 44 million morbidity and mortality records from four Australian states between 1st July 1999 and 31st December 2009 using probabilistic methods. The accuracy of the linkage was measured through a comparison with jurisdictional keys sourced from individual states. The extent of cross-border population movement between these states was also assessed. Results: Data matching identified almost twelve million individuals across the four Australian states. The percentage of individuals from one state with records found in another ranged from 3-5 %. Using jurisdictional keys to measure linkage quality, results indicate a high matching efficiency (F measure 97 to 99 %), with linkage processing taking only a matter of days. Conclusions: The results demonstrate the feasibility and accuracy of undertaking cross jurisdictional linkage for national research. The benefits are substantial, particularly in relation to capturing the full complement of records in patient pathways as a result of cross-border population movements. The project identified a sizeable 'mobile' population with hospital records in more than one state. Research studies that focus on a single jurisdiction will under-enumerate the extent of hospital usage by individuals in the population. It is important that researchers understand and are aware of the impact of this missing hospital activity on their studies. The project highlights the need for an efficient and accurate data linkage system to support national research across Australia. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

10. The effect of data cleaning on record linkage quality.

Author: Randall, Sean M., Ferrante, Anna M., Boyd, James H., and Semmens, James B.
Subjects: *INTEGRATED software, *LINKED data (Semantic Web), *QUALITY control, *DATABASE management, *INFORMATION resources
Abstract: Background: Within the field of record linkage, numerous data cleaning and standardisation techniques are employed to ensure the highest quality of links. While these facilities are common in record linkage software packages and are regularly deployed across record linkage units, little work has been published demonstrating the impact of data cleaning on linkage quality. Methods: A range of cleaning techniques was applied to both a synthetically generated dataset and a large administrative dataset previously linked to a high standard. The effect of these changes on linkage quality was investigated using pairwise F-measure to determine quality. Results: Data cleaning made little difference to the overall linkage quality, with heavy cleaning leading to a decrease in quality. Further examination showed that decreases in linkage quality were due to cleaning techniques typically reducing the variability - although correct records were now more likely to match, incorrect records were also more likely to match, and these incorrect matches outweighed the correct matches, reducing quality overall. Conclusions: Data cleaning techniques have minimal effect on linkage quality. Care should be taken during the data cleaning process. [ABSTRACT FROM AUTHOR]
Published: 2013
Full Text: View/download PDF
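
Note on item 10: the abstract reports linkage quality as a pairwise F-measure, the harmonic mean of precision and recall computed over linked record pairs. The sketch below shows one way to compute it under stated assumptions: the function name and the representation of links as unordered pairs of record identifiers are illustrative choices, and the example pairs are made up for the demonstration rather than taken from any study listed here.

```python
from typing import Hashable, Iterable, Tuple

Pair = Tuple[Hashable, Hashable]


def pairwise_quality(predicted: Iterable[Pair],
                     truth: Iterable[Pair]) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) over unordered record pairs."""
    # frozenset makes (a, b) and (b, a) count as the same link.
    predicted_pairs = {frozenset(p) for p in predicted}
    true_pairs = {frozenset(p) for p in truth}

    correct = len(predicted_pairs & true_pairs)
    precision = correct / len(predicted_pairs) if predicted_pairs else 0.0
    recall = correct / len(true_pairs) if true_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up pairs: two of three predicted links are among four true links.
    predicted = [(1, 2), (2, 3), (4, 5)]
    truth = [(1, 2), (3, 2), (6, 7), (8, 9)]
    print(pairwise_quality(predicted, truth))
```

Because the F-measure penalises both missed links and false links, a cleaning step that makes more true pairs match can still lower it if it adds even more incorrect matches, which is the effect the abstract describes.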

11. Data linkage infrastructure for cross-jurisdictional health-related research in Australia.

Author: Boyd, James H., Ferrante, Anna M., O'Keefe, Christine M., Bass, Alfred J., Randall, Sean M., and Semmens, James B.
Subjects: MEDICAL care, EMPLOYMENT practices, CONFIDENTIAL communications
Abstract: Background: The Centre for Data Linkage (CDL) has been established to enable national and cross-jurisdictional health-related research in Australia. It has been funded through the Population Health Research Network (PHRN), a national initiative established under the National Collaborative Research Infrastructure Strategy (NCRIS). This paper describes the development of the processes and methodology required to create cross-jurisdictional research infrastructure and enable aggregation of State and Territory linkages into a single linkage "map". Methods: The CDL has implemented a linkage model which incorporates best practice in data linkage and adheres to data integration principles set down by the Australian Government. Working closely with data custodians and State-based data linkage facilities, the CDL has designed and implemented a linkage system to enable research at national or cross-jurisdictional level. A secure operational environment has also been established with strong governance arrangements to maximise privacy and the confidentiality of data. Results: The development and implementation of a cross-jurisdictional linkage model overcomes a number of challenges associated with the federated nature of health data collections in Australia. The infrastructure expands Australia's data linkage capability and provides opportunities for population-level research. The CDL linkage model, infrastructure architecture and governance arrangements are presented. The quality and capability of the new infrastructure is demonstrated through the conduct of data linkage for the first PHRN Proof of Concept Collaboration project, where more than 25 million records were successfully linked to a very high quality. Conclusions: This infrastructure provides researchers and policy-makers with the ability to undertake linkage-based research that extends across jurisdictional boundaries. It represents an advance in Australia's national data linkage capabilities and sets the scene for stronger government-research collaboration. [ABSTRACT FROM AUTHOR]
Published: 2012
Full Text: View/download PDF

12. Technical challenges of providing record linkage services for research.

Author: Boyd, James H, Randall, Sean M, Ferrante, Anna M, Bauer, Jacqueline K, Brown, Adrian P, and Semmens, James B
Subjects: *INFORMATION retrieval standards, *BAR code standards, *MANAGEMENT of medical records, *ACQUISITION of data, *STANDARDS, ELECTRONIC health record standards
Abstract: Background: Record linkage techniques are widely used to enable health researchers to gain event based longitudinal information for entire populations. The task of record linkage is increasingly being undertaken by specialised linkage units (SLUs). In addition to the complexity of undertaking probabilistic record linkage, these units face additional technical challenges in providing record linkage 'as a service' for research. The extent of this functionality, and approaches to solving these issues, has had little focus in the record linkage literature. Few, if any, of the record linkage packages or systems currently used by SLUs include the full range of functions required. Methods: This paper identifies and discusses some of the functions that are required or undertaken by SLUs in the provision of record linkage services. These include managing routine, on-going linkage; storing and handling changing data; handling different linkage scenarios; accommodating ever increasing datasets. Automated linkage processes are one way of ensuring consistency of results and scalability of service. Results: Alternative solutions to some of these challenges are presented. By maintaining a full history of links, and storing pairwise information, many of the challenges around handling 'open' records, and providing automated managed extractions are solved. A number of these solutions were implemented as part of the development of the National Linkage System (NLS) by the Centre for Data Linkage (part of the Population Health Research Network) in Australia. Conclusions: The demand for, and complexity of, linkage services is growing. This presents as a challenge to SLUs as they seek to service the varying needs of dozens of research projects annually. Linkage units need to be both flexible and scalable to meet this demand. It is hoped the solutions presented here can help mitigate these difficulties. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF


12 results on '"Randall, Sean"'
