Back to Search Start Over

Feedback Autonomic Provisioning for Guaranteeing Performance in MapReduce Systems

Authors :
Bogdan Robu
Damián Serrano
Sara Bouchenak
Nicolas Marchand
Mihaly Berekmeri
GIPSA - Systèmes non linéaires et complexité (GIPSA-SYSCO)
Département Automatique (GIPSA-DA)
Grenoble Images Parole Signal Automatique (GIPSA-lab )
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Grenoble Images Parole Signal Automatique (GIPSA-lab )
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Laboratoire d'Informatique de Grenoble (LIG )
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Distribution, Recherche d'Information et Mobilité (DRIM)
Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS)
Institut National des Sciences Appliquées de Lyon (INSA Lyon)
Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-École Centrale de Lyon (ECL)
Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon)
Université de Lyon-Université Lumière - Lyon 2 (UL2)
Grid'5000
ANR-11-LABX-0025,PERSYVAL-lab,Systemes et Algorithmes Pervasifs au confluent des mondes physique et numérique(2011)
European Project: 610535,EC:FP7:ICT,FP7-ICT-2013-10,AMADEOS(2013)
Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL)
Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL)
Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon)
Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL)
Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)
Source :
IEEE transactions on cloud computing, IEEE transactions on cloud computing, IEEE, 2018, 6 (4), pp.1004-1016. ⟨10.1109/TCC.2016.2550047⟩, IEEE Transactions on Cloud Computing, IEEE Transactions on Cloud Computing, 2018, 6 (4), pp.1004-1016. ⟨10.1109/TCC.2016.2550047⟩
Publication Year :
2018
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2018.

Abstract

International audience; Companies have a fast growing amounts of data to process and store, a data explosion is happening next to us. Currentlyone of the most common approaches to treat these vast data quantities are based on the MapReduce parallel programming paradigm.While its use is widespread in the industry, ensuring performance constraints, while at the same time minimizing costs, still providesconsiderable challenges. We propose a coarse grained control theoretical approach, based on techniques that have already provedtheir usefulness in the control community. We introduce the first algorithm to create dynamic models for Big Data MapReduce systems,running a concurrent workload. Furthermore we identify two important control use cases: relaxed performance - minimal resourceand strict performance. For the first case we develop two feedback control mechanism. A classical feedback controller and an evenbasedfeedback, that minimises the number of cluster reconfigurations as well. Moreover, to address strict performance requirements afeedforward predictive controller that efficiently suppresses the effects of large workload size variations is developed. All the controllersare validated online in a benchmark running in a real 60 node MapReduce cluster, using a data intensive Business Intelligenceworkload. Our experiments demonstrate the success of the control strategies employed in assuring service time constraints.

Details

ISSN :
23720018 and 21687161
Volume :
6
Database :
OpenAIRE
Journal :
IEEE Transactions on Cloud Computing
Accession number :
edsair.doi.dedup.....8f55c0d4fa888ced13b1a9fd15758a74