1. Scientific Workflow Management on Hybrid Clouds with Cloud Bursting and Transparent Data Access
- Authors
Bartosz Baliś, Michal Orzechowski, Renata Slota, Łukasz Dutka, and Jacek Kitowski
- Subjects
File system; Computer science; Data management; Distributed computing; Networking & telecommunications; Provisioning; Cloud computing; Engineering and technology; Data access; Workflow; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Distributed file system; Workflow management system
- Abstract
Cloud bursting is an application deployment model wherein additional computing resources are provisioned from public clouds in cases where local resources are not sufficient, e.g. during peak demand periods. We propose and experimentally evaluate a cloud-bursting solution for scientific workflows. Our solution is portable thanks to using Kubernetes for deployment of the workflow management system and computing clusters in multiple clouds. We also introduce transparent data access by employing a virtual distributed file system across the clouds, allowing jobs to use a POSIX file system interface, while hiding data transfer between clouds. To balance load distribution and minimize the communication volume between clouds, we leverage graph partitioning, while ensuring that the algorithm distributes the load equally at each parallel execution stage of a workflow. The solution is experimentally evaluated using the HyperFlow workflow management system integrated with the Onedata data management platform, deployed in our on-premise cloud in Cyfronet AGH and in the Google Cloud.
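The abstract's partitioning goal (equal load at each parallel stage, few data dependencies crossing clouds) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify beyond "graph partitioning"; it is a simple greedy heuristic substituted for illustration. All names here (`stage_levels`, `partition`, `cut_edges`, the cloud labels `"local"` and `"public"`) are hypothetical, and the workflow is assumed to be an acyclic task graph with unit-cost tasks.

```python
from collections import defaultdict


def stage_levels(tasks, edges):
    """Return ({task: stage}, parents), where a task's stage is the length
    of the longest path from any source task. Assumes the graph is acyclic."""
    parents = defaultdict(list)
    children = defaultdict(list)
    indegree = {t: 0 for t in tasks}
    for u, v in edges:
        parents[v].append(u)
        children[u].append(v)
        indegree[v] += 1
    level = {t: 0 for t in tasks}
    ready = [t for t in tasks if indegree[t] == 0]
    while ready:  # Kahn-style topological traversal
        u = ready.pop()
        for v in children[u]:
            level[v] = max(level[v], level[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return level, parents


def partition(tasks, edges, clouds=("local", "public")):
    """Greedy heuristic (an assumption, not the paper's method): within each
    stage, no cloud receives more than ceil(n / len(clouds)) tasks, and each
    task prefers the cloud that already hosts most of its parents."""
    level, parents = stage_levels(tasks, edges)
    by_level = defaultdict(list)
    for t in tasks:
        by_level[level[t]].append(t)

    assignment = {}
    for lvl in sorted(by_level):
        stage = by_level[lvl]
        quota = -(-len(stage) // len(clouds))  # ceiling division
        load = {c: 0 for c in clouds}
        for t in stage:
            affinity = {c: 0 for c in clouds}
            for p in parents[t]:
                affinity[assignment[p]] += 1
            candidates = [c for c in clouds if load[c] < quota]
            # Prefer parent locality, then the less loaded cloud.
            best = max(candidates, key=lambda c: (affinity[c], -load[c]))
            assignment[t] = best
            load[best] += 1
    return assignment


def cut_edges(edges, assignment):
    """Count dependencies whose endpoints run in different clouds."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


if __name__ == "__main__":
    # Hypothetical fork-join workflow: split -> {a, b, c, d} -> merge
    tasks = ["split", "a", "b", "c", "d", "merge"]
    edges = [("split", x) for x in "abcd"] + [(x, "merge") for x in "abcd"]
    placement = partition(tasks, edges)
    print(placement)
    print("inter-cloud dependencies:", cut_edges(edges, placement))
```

With two clouds, the per-stage quota keeps the task counts within one of each other at every stage. A production partitioner would also weigh task runtimes and file sizes, which this sketch ignores.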
- Published
2021