
Castell: Scalable Joint Probability Estimation of Multi-dimensional Data Randomized with Local Differential Privacy

Authors :
Kikuchi, Hiroaki
Publication Year :
2022

Abstract

Performing randomized response (RR) over multi-dimensional data is subject to the curse of dimensionality. As the number of attributes increases, the exponential growth in the number of attribute-value combinations greatly increases the computational cost and degrades the accuracy of the RR estimates. In this paper, we propose a new multi-dimensional RR scheme that randomizes all attributes independently and then aggregates the per-attribute randomization matrices into a single aggregated matrix, from which the multi-dimensional joint probability distributions are estimated. The inverse of the aggregated randomization matrix can be computed efficiently, at a computational cost that is linear in the dimensionality and with manageable storage requirements. To overcome the limited accuracy of this baseline protocol, we propose two extensions, called the hybrid and truncated schemes. Finally, we conduct experiments using synthetic and major open-source datasets for various numbers of attributes, domain sizes, and numbers of respondents. On the UCI Adult dataset, the average distances between the estimated and the real (2- through 6-way) joint probabilities are 0.0099 for the truncated scheme and 0.0155 for the hybrid scheme, whereas they are 0.03 and 0.04 for LoPub, the state-of-the-art multi-dimensional LDP scheme.

Comment: 12 pages + 5-page appendix
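The efficiency claim in the abstract can be illustrated with a minimal sketch. It assumes k-ary randomized response for each attribute and uses the standard fact that independently randomized attributes give a joint randomization matrix equal to the Kronecker product of the per-attribute matrices, whose inverse is the Kronecker product of the per-attribute inverses. The abstract does not spell out the paper's construction, so this is an illustration under those assumptions, not the Castell protocol itself; the names `rr_matrix`, `apply_along_axes`, and `estimate_joint` are hypothetical.

```python
import numpy as np

def rr_matrix(k, eps):
    """k-ary randomized response matrix, P[i, j] = Pr(report i | true value j).

    Assumed mechanism (not taken from the paper): keep the true value with
    probability p, otherwise report one of the other k - 1 values uniformly.
    """
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    q = (1.0 - p) / (k - 1)
    return np.full((k, k), q) + (p - q) * np.eye(k)

def apply_along_axes(tensor, matrices):
    """Multiply each axis of `tensor` by the matrix for that attribute."""
    out = np.asarray(tensor, dtype=float)
    for axis, m in enumerate(matrices):
        # Contract m's columns with the chosen axis, then restore axis order.
        out = np.moveaxis(np.tensordot(m, out, axes=([1], [axis])), 0, axis)
    return out

def estimate_joint(observed, matrices):
    """Estimate the true joint distribution from the randomized one.

    Only the small per-attribute matrices are inverted; the full
    aggregated matrix is never built.
    """
    return apply_along_axes(observed, [np.linalg.inv(m) for m in matrices])

# Toy example: three attributes with domain sizes 2, 3 and 4.
rng = np.random.default_rng(0)
sizes = (2, 3, 4)
true_joint = rng.random(sizes)
true_joint /= true_joint.sum()
mats = [rr_matrix(k, eps=1.0) for k in sizes]

# Expected distribution of the randomized reports (no sampling noise).
observed = apply_along_axes(true_joint, mats)
estimate = estimate_joint(observed, mats)
```

In this sketch the work per attribute is one small matrix inversion, so the number of inversions grows linearly with the number of attributes, while the explicit aggregated matrix would have a side length equal to the product of all domain sizes. With a finite number of respondents, `observed` would be an empirical histogram and the estimate would carry sampling error, which is the accuracy limitation the abstract refers to.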

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2212.01627
Document Type :
Working Paper