1. Modulo Based Data Placement Algorithm for Energy Consumption Optimization of MapReduce System
- Author
Zhi Wang, Jie Song, Jean-Marc Pierson, Hongyan He, Ge Yu
Affiliations: Northeastern University [Shenyang] (China); Système d’exploitation, systèmes répartis, de l’intergiciel à l’architecture (IRIT-SEPIA); Institut de Recherche en Informatique de Toulouse (IRIT) (Toulouse, France); Université Fédérale Toulouse Midi-Pyrénées; Université Toulouse 1 Capitole (UT1) (France); Université Toulouse - Jean Jaurès (UT2J) (France); Université Toulouse III - Paul Sabatier (UT3) (France); Centre National de la Recherche Scientifique (CNRS) (France); Institut National Polytechnique de Toulouse (Toulouse INP, INPT) (France)
- Subjects
[INFO.INFO-AR] Computer Science [cs]/Hardware Architecture [cs.AR]; Correctness; Map Reduce; Computer Networks and Communications; Computer science; Distributed computing; Modulo; Data management; Big data; Operating systems; Networks and telecommunications; 02 engineering and technology; Data placement; [INFO.INFO-NI] Computer Science [cs]/Networking and Internet Architecture [cs.NI]; Hardware architectures; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Production (economics); sort; business.industry; Energy consumption; Embedded systems; Energy consumption optimization; Hardware and Architecture; [INFO.INFO-ES] Computer Science [cs]/Embedded Systems; 020201 artificial intelligence & image processing; [INFO.INFO-OS] Computer Science [cs]/Operating Systems [cs.OS]; business; Algorithm; Software; Energy (signal processing); Information Systems
- Abstract
With the explosion of data production, the efficiency of data management and analysis has become a concern for both industry and academia. Meanwhile, more and more energy is consumed by IT infrastructure, especially by large-scale distributed systems. In this paper, a novel idea for optimizing the Energy Consumption (EC for short) of MapReduce systems is proposed. We argue that fair data placement helps to save energy; we then propose three goals of data placement and a modulo-based Data Placement Algorithm (DPA for short) that achieves these goals. The correctness of the proposed DPA is then established from both theoretical and experimental perspectives. Three different systems that implement the MapReduce model with different DPAs are compared in our experiments. Our algorithm is shown to optimize EC effectively without introducing additional costs or delaying data loading. With the help of our DPA, the EC for WordCount, Sort and MRBench can be reduced by 10.9%, 8.3% and 17% respectively, and time consumption is reduced by 7%, 6.3% and 7% respectively.
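The abstract does not spell out the DPA itself, so the following is only a minimal sketch of what modulo-based placement can look like in general, not the authors' algorithm. It assumes consecutively numbered data blocks and homogeneous nodes; the names `place_blocks` and `load_per_node` are hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import Dict, List


def place_blocks(num_blocks: int, num_nodes: int) -> Dict[int, int]:
    """Assign block i to node (i mod num_nodes).

    Illustrative assumption only: the paper's DPA may use a different rule.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return {block: block % num_nodes for block in range(num_blocks)}


def load_per_node(placement: Dict[int, int], num_nodes: int) -> List[int]:
    """Count how many blocks each node holds, in node order."""
    counts = Counter(placement.values())
    return [counts.get(node, 0) for node in range(num_nodes)]


placement = place_blocks(num_blocks=10, num_nodes=4)
loads = load_per_node(placement, num_nodes=4)
print(loads)  # [3, 3, 2, 2]
```

Under these assumptions, the number of blocks held by any two nodes differs by at most one, which is one simple reading of the "fair" placement the abstract argues for.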
- Published
- 2016