PGPregel: An End-to-End System for Privacy-Preserving Graph Processing in Geo-Distributed Data Centers

Authors :
Amelie Chi Zhou
Ruibo Qiu
Thomas Lambert
Tristan Allard
Shadi Ibrahim
Amr El Abbadi
Shenzhen University [Shenzhen]
Web Scale Trustworthy Collaborative Service Systems (COAST)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Networks, Systems and Services (LORIA - NSS)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Security & PrIvaCY (SPICY)
SYSTÈMES LARGE ÉCHELLE (IRISA-D1)
Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA)
Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA)
Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)
Design and Implementation of Autonomous Distributed Systems (MYRIADS)
Inria Rennes – Bretagne Atlantique
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1)
Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
University of California [Santa Barbara] (UC Santa Barbara)
University of California (UC)
In addition to the ANR projects listed in this record, this work is supported by the National Natural Science Foundation of China (62172282), the Guangdong Natural Science Foundation (2022A1515010122, 2019A1515012053), the Guangdong Provincial Key Laboratory of Popular High Performance Computers, the Shenzhen Science and Technology Foundation (JCYJ20210324093212034), and the Tencent 'Rhinoceros Birds' Scientific Research Foundation for Young Teachers of Shenzhen University.
Association for Computing Machinery
Ada Gavrilovska
Deniz Altınbüken
Carsten Binnig
ANR-16-CE25-0014,KerStream,Traitement de données massives: allons au-delà d'Hadoop!(2016)
ANR-16-CE23-0004,CROWDGUARD,Confidentialité et efficacité dans les plates-formes de crowdsourcing(2016)
Source :
Proceedings of the 13th Symposium on Cloud Computing, SoCC '22: ACM Symposium on Cloud Computing, Association for Computing Machinery, Nov 2022, San Francisco, California, United States. pp. 386-402, ⟨10.1145/3542929.3563474⟩
Publication Year :
2022
Publisher :
HAL CCSD, 2022.

Abstract

Graph processing is a popular computing model for big data analytics. Emerging big data applications are often maintained in multiple geographically distributed (geo-distributed) data centers (DCs) to provide low-latency services to global users. Graph processing in geo-distributed DCs suffers from costly inter-DC data communications. Furthermore, due to increasing privacy concerns, geo-distribution imposes diverse, strict, and often asymmetric privacy regulations that constrain geo-distributed graph processing. Existing graph processing systems fail to address these two challenges. In this paper, we design and implement PGPregel, which is an end-to-end system that provides privacy-preserving graph processing in geo-distributed DCs with low latency and high utility. To ensure privacy, PGPregel smartly integrates Differential Privacy into graph processing systems with the help of two core techniques, namely sampling and combiners, to reduce the amount of inter-DC data transfer while preserving good accuracy of graph processing results. We implement our design in Giraph and evaluate it in real cloud DCs. Results show that PGPregel can preserve the privacy of graph data with low overhead and good accuracy.

Details

Language :
English
Database :
OpenAIRE
Journal :
Proceedings of the 13th Symposium on Cloud Computing, SoCC '22: ACM Symposium on Cloud Computing, Association for Computing Machinery, Nov 2022, San Francisco, California, United States. pp. 386-402, ⟨10.1145/3542929.3563474⟩
Accession number :
edsair.doi.dedup.....670b64017007befbbb6dfb6b0467d32f