1. X10 and APGAS at Petascale
- Author
Mikio Takeuchi, Benjamin Herta, Wei Zhang, Olivier Tardieu, Vijay Saraswat, Prabhanjan Kambadur, Avraham Shinnar, David Grove, David Cunningham, and Mandana Vaziri
- Subjects
Computer science; Concurrency; 020207 software engineering; 02 engineering and technology; Parallel computing; Load balancing (computing); Supercomputer; Computer Science Applications; Petascale computing; Computational Theory and Mathematics; Hardware and Architecture; 020204 information systems; Modeling and Simulation; Scalability; 0202 electrical engineering, electronic engineering, information engineering; Programming paradigm; Partitioned global address space; IBM; Software
- Abstract
X10 is a high-performance, high-productivity programming language aimed at large-scale distributed and shared-memory parallel applications. It is based on the Asynchronous Partitioned Global Address Space (APGAS) programming model, supporting the same fine-grained concurrency mechanisms within and across shared-memory nodes. We demonstrate that X10 delivers solid performance at petascale by running (weak scaling) eight application kernels on an IBM Power 775 supercomputer utilizing up to 55,680 Power7 cores (for 1.7 Pflop/s of theoretical peak performance). For the four HPC Class 2 Challenge benchmarks, X10 achieves 41% to 87% of the system’s potential at scale (as measured by IBM’s HPCC Class 1 optimized runs). We also implement K-Means, Smith-Waterman, Betweenness Centrality, and Unbalanced Tree Search (UTS) for geometric trees. Our UTS implementation is the first to scale to petaflop systems. We describe the advances in distributed termination detection, distributed load balancing, and use of high-performance interconnects that enable X10 to scale out to tens of thousands of cores. We discuss how this work is driving the evolution of the X10 language, core class libraries, and runtime systems.
- Published
- 2016