1. A Constraint Programming Scheduler for Heterogeneous High-Performance Computing Machines
- Author
Thomas Bridi, Andrea Bartolini, Michele Lombardi, Michela Milano, Luca Benini
- Subjects
Optimization problem, Computer science, Distributed computing, Scheduling (production processes), Processor scheduling, Resource allocation, Scheduling (computing), Fixed-priority pre-emptive scheduling, Robustness (computer science), Constraint programming, Resource management, Quality of service, Computational Theory and Mathematics, Virtual machine, Hardware and Architecture, Scalability, HPC, Signal Processing, supercomputer, scheduling, optimization
- Abstract
Scheduling and dispatching tools for high-performance computing (HPC) machines have the key role of mapping jobs to the available resources, trying to maximize performance and quality of service (QoS). Allocation and scheduling in the general case are well-known NP-hard problems, forcing commercial schedulers to adopt greedy approaches to improve performance and QoS. Search-based approaches, which explore the solution space, have seldom been employed in this setting and have mostly been applied in off-line scenarios. In this paper, we present the first search-based approach to job allocation and scheduling for HPC machines that works in a production environment. The scheduler is based on Constraint Programming, an effective programming technique for optimization problems. The resulting scheduler is flexible, as it can be easily customized to deal with heterogeneous resources, user-defined constraints, and different metrics. We evaluate our solution both on virtual machines using synthetic workloads and on the Eurora HPC machine with production workloads. Tests on a wide range of operating conditions show significant improvements in waiting times and QoS on mid-tier HPC machines with respect to state-of-the-art commercial rule-based dispatchers. Furthermore, we analyze the conditions under which our approach outperforms commercial approaches, in order to create a portfolio of scheduling algorithms that ensures robustness, flexibility, and scalability.
- Published
- 2016