1. FLCD: A Flexible Low Complexity Design of Coded Distributed Computing
- Author
- Xingyue Wang, Mingyue Ji, Rong-Rong Chen, and Nicholas Woolsey
- Subjects
- FOS: Computer and information sciences; Scheme (programming language); Speedup; Computer Networks and Communications; Computer science; Computer Science - Information Theory; Distributed computing; Computation; Cloud computing; Reduction (complexity); Constant (computer programming); Flexibility (engineering); Computer Science - Performance; Information Theory (cs.IT); Computer Science Applications; Performance (cs.PF); Range (mathematics); Computer Science - Distributed, Parallel, and Cluster Computing; Hardware and Architecture; Distributed, Parallel, and Cluster Computing (cs.DC); Software; Information Systems
- Abstract
- We propose a flexible low complexity design (FLCD) of coded distributed computing (CDC) with empirical evaluation on Amazon Elastic Compute Cloud (Amazon EC2). CDC can expedite MapReduce-like computation by trading increased map computation for reduced communication load and shuffle time. A main novelty of FLCD is that it uses the design freedom in defining map and reduce functions to develop asymptotically homogeneous systems that support varying intermediate value (IV) sizes under a general MapReduce framework. Compared to existing designs with constant IV sizes, FLCD offers greater flexibility in adapting to network parameters and significantly reduces implementation complexity by requiring fewer input files and shuffle groups. The FLCD scheme is the first proposed low-complexity CDC design that can operate on a network with an arbitrary number of nodes and an arbitrary computation load. We perform empirical evaluations of FLCD by executing the TeraSort algorithm on an Amazon EC2 cluster. This is the first time that theoretical predictions of the CDC shuffle time have been validated by empirical evaluations. The evaluations demonstrate a 2.0x to 4.24x speedup compared to conventional uncoded MapReduce, a 12% to 52% reduction in total time, and a wider range of operating network parameters compared to existing CDC schemes. (13 pages, 4 figures)
- Published
- 2023