Back to Search Start Over

The case for phase-aware scheduling of parallelizable jobs

Authors :
Weina Wang
Justin Whitehouse
Benjamin Berg
Mor Harchol-Balter
Benjamin Moseley
Source :
Performance Evaluation. 153:102246
Publication Year :
2022
Publisher :
Elsevier BV, 2022.

Abstract

Parallelizable jobs typically consist of multiple phases of computation, where the job is more parallelizable in some phases and less parallelizable in others. For example, in a database, a query may consist of a highly parallelizable table scan, followed by a less parallelizable table join. In the past, this phase-varying parallelizability was summarized by a single sub-linear speedup curve that measures a job’s average parallelizability over its entire lifetime. Today, however, modern systems have fine-grained knowledge of the exact phase each job is in at every moment in time. Unfortunately, these systems do not fully leverage this real-time feedback when scheduling parallelizable jobs. Theory has failed to produce practical phase-aware scheduling policies, and thus scheduling in current systems is largely heuristic. A phase-aware scheduling policy must decide, given its knowledge of each job’s current phase, how many servers or cores to allocate to each job in the system at every moment in time. This paper provides the first stochastic model of a system processing parallelizable jobs composed of phases. Using our model, we derive an optimal phase-aware scheduling policy that minimizes the mean response time across jobs. Our provably optimal policy, Inelastic-First ( IF ), gives strict priority to jobs that are currently in less parallelizable phases. We validate our theoretical results using a simulation of a database running queries from the Star Schema Benchmark. We compare IF to a range of policies from both systems and theory, and show that IF reduces mean response time by up to a factor of 3.

Details

ISSN :
01665316
Volume :
153
Database :
OpenAIRE
Journal :
Performance Evaluation
Accession number :
edsair.doi.dedup.....a8e43caa99b55a52e6771f7cafd2379a