
Expected 10-anonymity of HyperLogLog sketches for federated queries of clinical data repositories.

Authors :
Tao Z
Weber GM
Yu YW
Source :
Bioinformatics (Oxford, England) [Bioinformatics] 2021 Jul 12; Vol. 37 (Suppl_1), pp. i151-i160.
Publication Year :
2021

Abstract

Motivation: The rapid growth of electronic medical records provides immense potential to researchers, but these records are often siloed at separate hospitals. As a result, federated networks have arisen, which allow simultaneous querying of medical databases at a group of connected institutions. The most basic such query is the aggregate count, e.g. "How many patients have diabetes?" However, depending on the protocol used to estimate that total, there is always a tradeoff between the accuracy of the estimate and the risk of leaking confidential data. Prior work has shown that it is possible to empirically control that tradeoff by using the HyperLogLog (HLL) probabilistic sketch.

Results: In this article, we prove complementary theoretical bounds on the k-anonymity privacy risk of using HLL sketches, and we exhibit code to efficiently compute those bounds.

Availability and Implementation: https://github.com/tzyRachel/K-anonymity-Expectation.

(© The Author(s) 2021. Published by Oxford University Press.)
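
To illustrate the kind of federated aggregate count the abstract describes, the following is a minimal Python sketch of a generic HyperLogLog estimator in which each site builds a sketch locally and only the sketches are merged. It is not the code from the authors' repository and does not compute the k-anonymity bounds. The function names (hll_sketch, hll_merge, hll_estimate), the SHA-256-based hash, the patient identifier strings, and the precision p = 10 are all assumptions made for illustration.

```python
import hashlib
import math


def _hash64(item: str) -> int:
    """Map an identifier to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hll_sketch(items, p=10):
    """Build an HLL sketch with m = 2**p registers from an iterable of strings."""
    m = 1 << p
    width = 64 - p  # bits left after the bucket index is taken
    registers = [0] * m
    for item in items:
        h = _hash64(item)
        bucket = h >> width                  # top p bits select the register
        rest = h & ((1 << width) - 1)        # remaining bits
        # Rank = position of the leftmost 1 bit in `rest` (width + 1 if rest == 0).
        rank = width - rest.bit_length() + 1
        registers[bucket] = max(registers[bucket], rank)
    return registers


def hll_merge(a, b):
    """Merge two sketches of equal size by taking the register-wise maximum."""
    if len(a) != len(b):
        raise ValueError("sketches must have the same number of registers")
    return [max(x, y) for x, y in zip(a, b)]


def hll_estimate(registers):
    """Estimate the number of distinct items (bias constant valid for m >= 128)."""
    m = len(registers)
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        # Small-range correction (linear counting).
        return m * math.log(m / zeros)
    return raw


# Two hypothetical hospitals with overlapping patient populations.
site_a = hll_sketch(f"patient-{i}" for i in range(0, 6000))
site_b = hll_sketch(f"patient-{i}" for i in range(4000, 10000))
merged = hll_merge(site_a, site_b)
print(round(hll_estimate(merged)))  # approximate count of distinct patients
```

Because merging is a register-wise maximum, patients present at both sites are counted once, and the merged sketch is identical to a sketch built over the combined population. With 2**10 registers the standard error of the estimate is roughly 1.04 / sqrt(1024), about 3%. Fewer registers reveal less about individual patients but give a noisier count, which is the accuracy versus privacy tradeoff the abstract refers to.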

Details

Language :
English
ISSN :
1367-4811
Volume :
37
Issue :
Suppl_1
Database :
MEDLINE
Journal :
Bioinformatics (Oxford, England)
Publication Type :
Academic Journal
Accession number :
34252969
Full Text :
https://doi.org/10.1093/bioinformatics/btab292