Parallelizing with BDSC, a resource-constrained scheduling algorithm for shared and distributed memory systems
- Authors
Corinne Ancourt, Dounia Khaldi, Pierre Jouvelot (Department of Computer Science, Rice University [Houston]; Centre de Recherche en Informatique (CRI), MINES ParisTech - École nationale supérieure des mines de Paris; Université Paris sciences et lettres (PSL))
- Subjects
Schedule, Shared memory, Computer Networks and Communications, Computer science, Distributed computing, Task parallelism, Parallel computing, Theoretical Computer Science, Resource management (computing), Artificial Intelligence, DSC algorithm, Computer Graphics and Computer-Aided Design, Static scheduling, Distributed memory, Automatic parallelization, PIPS, Hardware and Architecture, Distributed memory systems, Compiler, [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], Software
- Abstract
Highlights: BDSC schedules parallel programs in the presence of resource constraints. BDSC-based parallelization relies on static program analyses for cost modeling. BDSC-based parallelization yields significant speedups on parallel architectures.

We introduce a new parallelization framework for scientific computing based on BDSC, an efficient automatic scheduling algorithm for parallel programs in the presence of resource constraints on the number of processors and their local memory size. BDSC extends Yang and Gerasoulis's Dominant Sequence Clustering (DSC) algorithm; it uses sophisticated cost models and addresses both shared and distributed parallel memory architectures. We describe BDSC, its integration within the PIPS compiler infrastructure, and its application to the parallelization of four well-known scientific applications: Harris, ABF, equake and IS. Our experiments suggest that BDSC's focus on efficient resource management leads to significant parallelization speedups on both shared and distributed memory systems, improving upon DSC results, as shown by comparing the sequential and parallelized versions of these four applications running on both OpenMP and MPI frameworks.
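The abstract's central mechanism, list scheduling of a task dependence graph under bounds on the number of processors and their local memory size, can be made concrete with a short sketch. The following Python code is a minimal, hypothetical greedy scheduler in that spirit; it is not the authors' BDSC or its PIPS integration, and the Task structure, the static cost and memory estimates, and the earliest-start placement rule are all assumptions made for illustration.

```python
# Hypothetical sketch of resource-bounded list scheduling in the spirit of
# BDSC: tasks of a dependence DAG are greedily placed on a bounded number of
# processors, subject to a per-processor memory budget. The Task structure,
# the static costs, and the placement rule are illustrative assumptions,
# not the authors' implementation.
from collections import deque
from dataclasses import dataclass, field

@dataclass(eq=False)
class Task:
    name: str
    cost: float                                  # estimated execution time
    memory: float                                # estimated memory footprint
    preds: list = field(default_factory=list)    # data-dependence predecessors

def topo_sort(tasks):
    """Kahn-style topological order over the dependence DAG."""
    indeg = {t: len(t.preds) for t in tasks}
    succs = {t: [] for t in tasks}
    for t in tasks:
        for p in t.preds:
            succs[p].append(t)
    queue = deque(t for t in tasks if indeg[t] == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                queue.append(s)
    return order

def schedule(tasks, num_procs, mem_budget):
    """Place each task, in dependence order, on the processor that lets it
    start earliest among those whose memory budget it does not exceed."""
    ready_time = [0.0] * num_procs   # when each processor becomes free
    mem_used = [0.0] * num_procs     # memory committed on each processor
    finish = {}                      # task -> finish time
    placement = {}                   # task name -> processor index

    for t in topo_sort(tasks):
        # A task may start only once all its predecessors have finished.
        data_ready = max((finish[p] for p in t.preds), default=0.0)
        best, best_start = None, float("inf")
        for proc in range(num_procs):
            if mem_used[proc] + t.memory > mem_budget:
                continue             # this processor lacks memory for t
            start = max(ready_time[proc], data_ready)
            if start < best_start:
                best, best_start = proc, start
        if best is None:
            raise RuntimeError(f"no processor can hold task {t.name}")
        placement[t.name] = best
        mem_used[best] += t.memory
        ready_time[best] = best_start + t.cost
        finish[t] = ready_time[best]
    return placement, max(ready_time)

if __name__ == "__main__":
    # A small fork-join DAG on 2 processors with a 100-unit memory budget:
    # b and c depend on a and can run in parallel; d joins them.
    a = Task("a", cost=2.0, memory=10.0)
    b = Task("b", cost=3.0, memory=40.0, preds=[a])
    c = Task("c", cost=3.0, memory=40.0, preds=[a])
    d = Task("d", cost=1.0, memory=10.0, preds=[b, c])
    print(schedule([a, b, c, d], num_procs=2, mem_budget=100.0))
```

In BDSC proper, per the abstract, the cost and memory estimates come from static program analyses in the PIPS compiler infrastructure rather than being supplied by hand as in this sketch.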
- Published
2015