
MPC: A Unified Parallel Runtime for Clusters of NUMA Machines

Authors :
Hervé Jourdren
Raymond Namyst
Marc Pérache
DAM Île-de-France (DAM/DIF)
Direction des Applications Militaires (DAM)
Commissariat à l'énergie atomique et aux énergies alternatives (CEA)
Laboratoire Bordelais de Recherche en Informatique (LaBRI)
Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
Springer
Source :
The 14th International Euro-Par Conference, Aug 2008, Las Palmas de Gran Canaria, Spain, pp. 78-88, ⟨10.1007/978-3-540-85451-7_9⟩, Lecture Notes in Computer Science, ISBN: 9783540854500, Euro-Par
Publication Year :
2008
Publisher :
HAL CCSD, 2008.

Abstract

Over the last decade, the Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed-memory architectures such as clusters. However, the architecture of cluster nodes is currently evolving from small symmetric shared-memory multiprocessors towards massively multicore, Non-Uniform Memory Access (NUMA) hardware. Although regular MPI implementations use numerous optimizations to achieve zero-copy, cache-oblivious data transfers within shared-memory nodes, they may prevent applications from exploiting most of the hardware's performance, simply because the scheduling of heavyweight processes is not flexible enough to adapt dynamically to the underlying hardware topology. This explains why several research efforts have investigated hybrid approaches that mix message passing between nodes with memory sharing inside nodes, such as MPI+OpenMP solutions [1,2]. However, these approaches require substantial programming effort to adapt or rewrite existing MPI applications. In this paper, we present the MultiProcessor Communications environment (MPC), which aims to provide programmers with an efficient runtime system for their existing MPI, POSIX Thread or hybrid MPI+Thread applications. The key idea is to use user-level threads instead of processes on multiprocessor cluster nodes in order to increase scheduling flexibility, to better control memory allocations, and to optimize the scheduling of communication flows with other nodes. Most existing MPI applications can run over MPC with no modification. We obtained substantial gains (up to 20%) by using MPC instead of a regular MPI runtime on several scientific applications.

Details

Language :
English
ISBN :
978-3-540-85450-0
ISBNs :
9783540854500
Database :
OpenAIRE
Journal :
The 14th International Euro-Par Conference, Aug 2008, Las Palmas de Gran Canaria, Spain, pp. 78-88, ⟨10.1007/978-3-540-85451-7_9⟩, Lecture Notes in Computer Science, ISBN: 9783540854500, Euro-Par
Accession number :
edsair.doi.dedup.....b92b6d01d0d4fc852c3e67128b1ef631
Full Text :
https://doi.org/10.1007/978-3-540-85451-7_9