51. Integrating Parallelizing Compilation Technologies for SMP Clusters
- Author
Yi-Ran Wang, Chun-Lei Sang, Xiao-Mi An, Xiaobing Feng, Li Chen, Lin Ma, and Zhaoqing Zhang
- Subjects
Profiling (computer programming), Loop optimization, Speedup, Fortran, Computer science, Parallel computing, Directive, Computer Science Applications, Theoretical Computer Science, Computational Theory and Mathematics, Computer architecture, Hardware and Architecture, Robustness (computer science), Scalability, Profile-guided optimization, Compiler, Implementation, Software, Compiler correctness
- Abstract
This paper presents AutoPar, a source-to-source parallelizing compiler system that transforms FORTRAN programs into multi-level hybrid MPI/OpenMP parallel programs. Integrated parallel optimizing technologies are used extensively to derive an effective program decomposition at whole-program scope. Other features, such as synchronization optimization and communication optimization, improve the performance scalability of the generated parallel programs both within a node and across nodes. The system puts considerable effort into automating parallelization: profiling feedback drives the performance estimation on which automatic program decomposition is based. Performance results are given for eight benchmarks from NPB 1.0 from NAS on an SMP cluster, and the reported speedups are good. Notably, in the experiments the user inserted at most one data distribution directive and one reduction directive in BT/SP/LU. The compiler is built on the Open Research Compiler (ORC), a powerful compiler infrastructure that is robust, flexible, and efficient. ORC's strong analysis capability and well-defined infrastructure made the system quick to implement.
- Published
2005