Importance of Explicit Vectorization for CPU and GPU Software Performance
- Publication Year :
- 2010
Abstract
- Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be applied in addition, are discussed less frequently. In this paper, we present an analysis of several optimizations applied to both central processing unit (CPU) and GPU implementations of a particular computationally intensive Metropolis Monte Carlo algorithm. Explicit vectorization on the CPU and its GPU equivalent, explicit memory coalescing, are found to be critical to achieving good performance of this algorithm in both environments. The fully-optimized CPU version achieves a 9x to 12x speedup over the original CPU version, in addition to the speedup from multi-threading, and is 2x faster than the fully-optimized GPU version.
- 17 pages, 17 figures
- Subjects :
- FOS: Computer and information sciences
Speedup
Physics and Astronomy (miscellaneous)
Computer science
Graphics processing unit
FOS: Physical sciences
Software performance testing
Parallel computing
CPU shielding
Computational science
CUDA
Computer Science::Operating Systems
Numerical Analysis
Computer Science - Performance
CPU modes
Applied Mathematics
Computational Physics (physics.comp-ph)
Computer Science Applications
Performance (cs.PF)
Computer Science::Performance
Computational Mathematics
Computer Science - Distributed, Parallel, and Cluster Computing
Modeling and Simulation
Vectorization (mathematics)
Central processing unit
Distributed, Parallel, and Cluster Computing (cs.DC)
Physics - Computational Physics
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....24e7a880ae6a7a74527d350ba167485c