Feedback Autonomic Provisioning for Guaranteeing Performance in MapReduce Systems
- Source :
- IEEE Transactions on Cloud Computing, IEEE, 2018, 6 (4), pp.1004-1016. ⟨10.1109/TCC.2016.2550047⟩
- Publication Year :
- 2018
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2018.
Abstract
- Companies have fast-growing amounts of data to process and store; a data explosion is happening all around us. Currently, one of the most common approaches to treating these vast data quantities is based on the MapReduce parallel programming paradigm. While its use is widespread in the industry, ensuring performance constraints while at the same time minimizing costs still poses considerable challenges. We propose a coarse-grained control-theoretical approach, based on techniques that have already proved their usefulness in the control community. We introduce the first algorithm to create dynamic models for Big Data MapReduce systems running a concurrent workload. Furthermore, we identify two important control use cases: relaxed performance - minimal resource, and strict performance. For the first case we develop two feedback control mechanisms: a classical feedback controller and an event-based feedback controller that also minimises the number of cluster reconfigurations. Moreover, to address strict performance requirements, a feedforward predictive controller that efficiently suppresses the effects of large workload size variations is developed. All the controllers are validated online in a benchmark running on a real 60-node MapReduce cluster, using a data-intensive Business Intelligence workload. Our experiments demonstrate the success of the control strategies employed in assuring service time constraints.
- Subjects :
- 0209 industrial biotechnology
Computer Networks and Communications
business.industry
Computer science
Distributed computing
Node (networking)
Big data
Feed forward
Workload
Provisioning
Cloud computing
02 engineering and technology
[SPI.AUTO]Engineering Sciences [physics]/Automatic
Computer Science Applications
Data modeling
020901 industrial engineering & automation
Hardware and Architecture
[INFO.INFO-AU]Computer Science [cs]/Automatic Control Engineering
0202 electrical engineering, electronic engineering, information engineering
Benchmark (computing)
020201 artificial intelligence & image processing
business
Software
Information Systems
Details
- ISSN :
- 2372-0018 and 2168-7161
- Volume :
- 6
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Cloud Computing
- Accession number :
- edsair.doi.dedup.....8f55c0d4fa888ced13b1a9fd15758a74