1. Scalable DP-SGD: Shuffling vs. Poisson Subsampling
- Authors
Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, and Chiyuan Zhang
- Subjects
Computer Science - Machine Learning, Computer Science - Cryptography and Security, Computer Science - Data Structures and Algorithms
- Abstract
We provide new lower bounds on the privacy guarantee of the multi-epoch Adaptive Batch Linear Queries (ABLQ) mechanism with shuffled batch sampling, demonstrating substantial gaps when compared to Poisson subsampling; prior analysis was limited to a single epoch. Since the privacy analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) is obtained by analyzing the ABLQ mechanism, this calls into serious question the common practice of implementing shuffling-based DP-SGD while reporting privacy parameters as if Poisson subsampling were used. To understand the impact of this gap on the utility of trained machine learning models, we introduce a practical approach for implementing Poisson subsampling at scale using massively parallel computation, and use it to train models efficiently. We compare the utility of models trained with Poisson-subsampling-based DP-SGD against the optimistic estimates of utility under shuffling, via our new lower bounds on the privacy guarantee of ABLQ with shuffling.
- Comment
To appear at NeurIPS 2024
- Published
2024
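
The crux of the abstract is the difference between the two batch-sampling schemes that feed DP-SGD: Poisson subsampling, which standard privacy accounting assumes, and shuffling, which practitioners commonly deploy. Below is a minimal sketch of the two samplers, assuming a dataset of n indexed examples; all function names and parameters are illustrative and not from the paper.

```python
# A minimal, illustrative sketch (not the authors' implementation) of the
# two batch-sampling schemes compared in the paper. All names and
# parameters here are assumptions for exposition.
import numpy as np


def poisson_batches(n, sampling_prob, num_steps, rng):
    """Poisson subsampling: each of the n examples joins each batch
    independently with probability sampling_prob, so batch sizes vary."""
    for _ in range(num_steps):
        mask = rng.random(n) < sampling_prob
        yield np.flatnonzero(mask)


def shuffled_batches(n, batch_size, num_epochs, rng):
    """Shuffled batch sampling: each epoch permutes the indices once and
    slices the permutation into fixed-size batches, so every example
    appears exactly once per epoch."""
    for _ in range(num_epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            yield perm[start:start + batch_size]


rng = np.random.default_rng(0)
n, batch_size = 1000, 100

# Poisson batch sizes are random, with mean n * (batch_size / n) = 100.
sizes = [len(b) for b in poisson_batches(n, batch_size / n, num_steps=5, rng=rng)]
print("Poisson batch sizes:", sizes)

# Shuffled batches are always exactly batch_size (here, 100).
sizes = [len(b) for b in shuffled_batches(n, batch_size, num_epochs=1, rng=rng)]
print("Shuffled batch sizes:", sizes[:5], "...")
```

The accounting gap arises because Poisson subsampling gives each example an independent chance of being absent from any given batch, whereas shuffling guarantees every example appears exactly once per epoch; the paper's lower bounds quantify how much the shuffled variant's multi-epoch guarantee can fall short of the Poisson-based figures typically reported.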