Topology design and graph embedding for decentralized federated learning
- Source :
- Intelligent and Converged Networks, Vol 5, Iss 2, Pp 100-115 (2024)
- Publication Year :
- 2024
- Publisher :
- Tsinghua University Press, 2024.
-
Abstract
- Federated learning has been widely employed in many applications to protect the data privacy of participating clients. Although the dataset is decentralized among training devices in federated learning, the model parameters are usually stored in a centralized manner. Centralized federated learning is easy to implement; however, a centralized scheme causes a communication bottleneck at the central server, which may significantly slow down the training process. To improve training efficiency, we investigate the decentralized federated learning scheme. The decentralized scheme has become feasible with the rapid development of device-to-device communication techniques under 5G. Nevertheless, the convergence rate of learning models in the decentralized scheme depends on the network topology design. We propose optimizing the topology design to improve training efficiency for decentralized federated learning, which is a non-trivial problem, especially when considering data heterogeneity. In this paper, we first demonstrate the advantage of hypercube topology and present a hypercube graph construction method to reduce data heterogeneity by carefully selecting neighbors of each training device, a process that resembles classic graph embedding. In addition, we propose a heuristic method for generating torus graphs. Moreover, we explore the communication patterns in hypercube topology and propose a sequential synchronization scheme to reduce communication cost during training. A batch synchronization scheme is presented to fine-tune the communication pattern for hypercube topology. Experiments on real-world datasets show that our proposed graph construction methods can accelerate the training process, and our sequential synchronization scheme can significantly reduce the overall communication traffic during training.
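- The hypercube structure underlying the abstract can be illustrated with a minimal sketch. This is not the paper's construction method, only the standard definition of a d-dimensional hypercube graph that it builds on: with n = 2^d clients, client i connects to exactly the d clients whose binary ids differ from i in a single bit, so each client synchronizes with only log2(n) peers per round.

```python
def hypercube_neighbors(node_id: int, dim: int) -> list[int]:
    """Ids of the `dim` neighbors of `node_id` in a d-dimensional hypercube.

    Flipping each of the `dim` bits of the id yields one neighbor,
    so every node has degree `dim` = log2(number of nodes).
    """
    return [node_id ^ (1 << bit) for bit in range(dim)]


def build_hypercube(dim: int) -> dict[int, list[int]]:
    """Adjacency list of the full 2**dim-node hypercube graph."""
    return {i: hypercube_neighbors(i, dim) for i in range(2 ** dim)}
```

  Each client thus communicates with only d peers per round while the graph diameter stays at d hops, which is the trade-off (low per-round degree, short mixing paths) motivating the hypercube topology in the abstract; how the paper assigns heterogeneous clients to hypercube vertices is its own contribution and is not reproduced here.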
Details
- Language :
- English
- ISSN :
- 27086240
- Volume :
- 5
- Issue :
- 2
- Database :
- Directory of Open Access Journals
- Journal :
- Intelligent and Converged Networks
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.2cb123f7eaf4279ac25fba2e455440d
- Document Type :
- article
- Full Text :
- https://doi.org/10.23919/ICN.2024.0008