1. Libra
- Author
- Guangwen Yang, Zihong Lv, Haohuan Fu, and Mengyao Jin
- Subjects
Speedup; Computer science; Computation; 020206 networking & telecommunications; 02 engineering and technology; Parallel computing; 010502 geochemistry & geophysics; 01 natural sciences; Code transformation; Factor (programming language); Limit (music); 0202 electrical engineering, electronic engineering, information engineering; Parallelism (grammar); Code generation; Implementation; computer; 0105 earth and related environmental sciences; computer.programming_language
- Abstract
Stencils account for a significant part of many scientific computing applications. Besides simple stencils that can be completed with a few arithmetic operations, there are also many register-limited stencils with hundreds or thousands of variables and operations. The large number of registers required by these stencils severely limits the parallelism of programs on current many-core architectures and consequently degrades overall performance. Because register usage is the major constraining factor for most register-limited stencils, we propose a DDG (data-dependency-graph) oriented code transformation approach to improve their performance. This approach analyzes, reorders, and transforms the original program on GPUs, and searches for the best tradeoff between the amount of computation and the degree of parallelism. Based on this graph-oriented code transformation approach, we design and implement an automated code generation and tuning framework called Libra, which improves productivity and performance simultaneously. We apply Libra to five widely used stencils, and experimental results show that these stencils achieve speedups of 1.12x to 2.16x over the original, fairly optimized implementations.
- Published
- 2016
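The abstract's central claim, that heavy register use limits parallelism on GPUs, can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the paper and is not part of Libra. It assumes a hypothetical streaming multiprocessor with 65,536 registers, at most 2,048 resident threads, and 32-thread warps, and it ignores allocation granularity, shared memory, and block-size limits. The function names are illustrative only.

```python
def resident_threads_per_sm(regs_per_thread,
                            regs_per_sm=65536,        # assumed register file size
                            max_threads_per_sm=2048,  # assumed hardware thread limit
                            warp_size=32):            # assumed warp width
    """Upper bound on resident threads per SM imposed by register usage alone."""
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    # Registers are handed out to whole warps, so round down to full warps.
    warps_by_regs = regs_per_sm // (regs_per_thread * warp_size)
    return min(warps_by_regs * warp_size, max_threads_per_sm)


def occupancy(regs_per_thread, max_threads_per_sm=2048, **kwargs):
    """Fraction of the assumed thread limit that register usage still allows."""
    threads = resident_threads_per_sm(
        regs_per_thread, max_threads_per_sm=max_threads_per_sm, **kwargs)
    return threads / max_threads_per_sm


if __name__ == "__main__":
    for regs in (32, 64, 128, 255):
        print(regs, resident_threads_per_sm(regs), occupancy(regs))
```

Under these assumed limits, a kernel using 32 registers per thread can fill the multiprocessor, while one using 128 registers per thread is capped at a quarter of the thread limit. This is the kind of tradeoff between per-thread work and parallelism that the abstract describes.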