SaC/C formulations of the all-pairs N-body problem and their performance on SMPs and GPGPUs.
- Source :
- Concurrency & Computation: Practice & Experience; Mar2014, Vol. 26 Issue 4, p952-971, 20p
- Publication Year :
- 2014
Abstract
- SUMMARY This paper describes our experience in implementing the classical N-body algorithm in SaC and analysing the runtime performance achieved on three different machines: a dual-processor 8-core Dell PowerEdge 2950 (a Beowulf cluster node, the reference machine), a quad-core hyper-threaded Intel Core-i7 based system equipped with an NVidia GTX-480 graphics accelerator and an Oracle Sparc T4-4 server with a total of 256 hardware threads. We contrast our findings with those resulting from the reference C code and a few variants of it that employ OpenMP pragmas as well as explicit vectorisation. Our experiments demonstrate that the SaC implementation successfully combines a high level of abstraction, very close to the mathematical specification, with very competitive runtimes. In fact, SaC matches or outperforms the hand-vectorised and hand-parallelised C codes on all three systems under investigation without the need for any source code modification. Furthermore, only SaC is able to effectively harness the advanced compute power of the graphics accelerator, again by mere recompilation of the same source code. Our results illustrate the benefits that SaC provides to application programmers in terms of coding productivity, source code, and performance portability among different machine architectures, as well as long-term maintainability in evolving hardware environments. Copyright © 2013 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 1532-0626
- Volume :
- 26
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Concurrency & Computation: Practice & Experience
- Publication Type :
- Academic Journal
- Accession number :
- 94475886
- Full Text :
- https://doi.org/10.1002/cpe.3078