1. Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs.
- Author
- Knorr, Fabian; Salzmann, Philip; Thoman, Peter; and Fahringer, Thomas
- Subjects
- *COMMUNICATION patterns, *DEGREES of freedom, *HIGH performance computing
- Abstract
- Collective communication APIs equip MPI vendors with the necessary context to optimize cluster-wide operations on the basis of theoretical complexity models and characteristics of the involved interconnects. Modern HPC runtime systems with a programmability focus can perform dependency analysis to eliminate the need for manual communication entirely. Profiting from optimized collective routines in this context often requires global analysis of the implicit point-to-point communication pattern or tight constraints on the data access patterns allowed inside kernels. The Celerity API provides a high degree of freedom for both runtime implementors and application developers by tying transparent work assignment to data access patterns through user-defined range-mapper functions. Canonically, data dependencies are resolved through an intra-node coherence model and inter-node point-to-point communication. This paper presents Collective Pattern Discovery (CPD), a fully distributed, coordination-free method for detecting collective communication patterns on parallelized task graphs. Through extensive scheduling and communication microbenchmarks as well as a strong scaling experiment on a compute-intensive application, we demonstrate that CPD can achieve substantial performance gains in the Celerity model. [ABSTRACT FROM AUTHOR]
- Published
- 2024