1. Dynamically Negotiating Capacity Between On-demand and Batch Clusters
- Author
- Feng Liu, Pierre Riteau, Jon Weissman, and Kate Keahey
- Subjects
020203 distributed computing, business.industry, Computer science, Distributed computing, Cloud computing, Context (language use), 02 engineering and technology, computer.software_genre, Metacomputing, Workflow, Grid computing, Analytics, 0202 electrical engineering, electronic engineering, information engineering, Cluster (physics), 020201 artificial intelligence & image processing, business, Cluster analysis, computer
- Abstract
In the era of rapid experimental expansion, data analysis needs are rapidly outpacing the capabilities of small institutional clusters, and institutions are looking to integrate HPC resources into their workflows. We propose one way of reconciling the on-demand needs of experimental analytics with batch-managed HPC resources: a system that dynamically moves nodes between an on-demand cluster configured with cloud technology (OpenStack) and a traditional HPC cluster managed by a batch scheduler (Torque). We evaluate this system experimentally, both with real-life traces representing two years of a specific institutional need and with synthetic traces that capture generalized characteristics of potential batch and on-demand workloads. Our results for the real-life scenario show that our approach could reduce the current investment in on-demand infrastructure by 82% while at the same time improving the mean batch wait time by almost an order of magnitude (8x).
- Published
- 2018