Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing

Authors :
Chenghao Lyu
Qi Fan
Fei Song
Arnab Sinha
Yanlei Diao
Wei Chen
Li Ma
Yihui Feng
Yaliang Li
Kai Zeng
Jingren Zhou
University of Massachusetts [Amherst] (UMass Amherst)
University of Massachusetts System (UMASS)
Rich Data Analytics at Cloud Scale (CEDAR)
Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX)
École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-Inria Saclay - Ile de France
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)
École polytechnique (X)
Alibaba Group [Hangzhou]
Source :
VLDB 2022 - 48th International Conference on Very Large Databases, Sep 2022, Sydney, Australia
Publication Year :
2022

Abstract

Big data processing at production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting the performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, the placement of parallel instances on machines, and the resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the scale and complexity of big data systems while having to meet stringent time constraints for scheduling. This paper presents a MaxCompute-based integrated system that supports multi-objective resource optimization via fine-grained instance-level modeling and optimization. We propose a new architecture that breaks RO into a series of simpler problems, new fine-grained predictive models, and novel optimization methods that exploit these models to make effective instance-level RO decisions well under a second. Evaluation using production workloads shows that our new RO system could reduce latency by 37-72% and cost by 43-78% at the same time, compared to the current optimizer and scheduler, while running in 0.02-0.23s.

Details

Language :
English
Database :
OpenAIRE
Journal :
VLDB 2022 - 48th International Conference on Very Large Databases, Sep 2022, Sydney, Australia
Accession number :
edsair.doi.dedup.....3dbc5b8804cff82c864378b1143294c1