101. Federated Computing for the Masses--Aggregating Resources to Tackle Large-Scale Engineering Problems
- Author
Jaroslaw Zola, Baskar Ganapathysubramanian, Javier Diaz-Montes, Manish Parashar, Ivan Rodero, and Yu Xie
- Subjects
General Computer Science, Queue management system, Computer science, End user, Distributed computing, General Engineering, Software-defined data center, Cloud computing, Resource (project management), Middleware, Scalability, Throughput (business)
- Abstract
The complexity of many problems in science and engineering requires computational capacity exceeding what the average user can expect from a single computational center. While many of these problems can be viewed as a set of independent tasks, their collective complexity easily requires millions of core-hours on any high-performance computing (HPC) resource, and throughput that can't be sustained by a single, multiuser queuing system. An exploration of the use of aggregated HPC resources to solve large-scale engineering problems shows that it's possible to build a computational federation that's easy for end users to implement, and is elastic, resilient, and scalable. Here, the authors argue that the fusion of federated computing and real-life engineering problems can be brought to the average user if relevant middleware is provided. They report on the use of a federation of 10 distributed heterogeneous HPC resources to perform a large-scale interrogation of the parameter space of a microscale fluid flow problem.
- Published
- 2014