Back to Search
Start Over
Castell: Scalable Joint Probability Estimation of Multi-dimensional Data Randomized with Local Differential Privacy
- Publication Year :
- 2022
-
Abstract
- Performing randomized response (RR) over multi-dimensional data is subject to the curse of dimensionality. As the number of attributes increases, the exponential growth in the number of attribute-value combinations greatly impacts the computational cost and the accuracy of the RR estimates. In this paper, we propose a new multi-dimensional RR scheme that randomizes all attributes independently, and then aggregates these randomization matrices into a single aggregated matrix. The multi-dimensional joint probability distributions are then estimated. The inverse matrix of the aggregated randomization matrix can be computed efficiently at a lightweight computation cost (i.e., linear with respect to dimensionality) and with manageable storage requirements. To overcome the limitation of accuracy, we propose two extensions to the baseline protocol, called {\em hybrid} and {\em truncated} schemes. Finally, we have conducted experiments using synthetic and major open-source datasets for various numbers of attributes, domain sizes, and numbers of respondents. The results using UCI Adult dataset give average distances between the estimated and the real (2 through 6-way) joint probability are $0.0099$ for {\em truncated} and $0.0155$ for {\em hybrid} schemes, whereas they are $0.03$ and $0.04$ for LoPub, which is the state-of-the-art multi-dimensional LDP scheme.<br />Comment: 12 pages + 5-page appendix
- Subjects :
- Computer Science - Cryptography and Security
Subjects
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2212.01627
- Document Type :
- Working Paper