Accelerating batched 1D-FFT with a CUDA-capable computer
- Source :
- 2010 IEEE International Conference on Imaging Systems and Techniques.
- Publication Year :
- 2010
- Publisher :
- IEEE, 2010.
Abstract
- In summary, we make the following concluding remarks:
  • We have assembled a low-cost CUDA-capable desktop PC, reflecting the PC state of the art of about 1½ years ago.
  • Under the Ubuntu 9.10 Linux operating system, we could enable CUDA by installing a recent Linux NVIDIA driver and the CUDA software (version 2.3).
  • By applying the Java-bindings-based JCuda software package, we could call CUFFT library functions from a Java environment.
  • We could easily perform batched (multiple) 1D-FFT in parallel by exploiting the batch facility of the CUFFT 1D-FFT on a CUDA-enabled GPU device. In this way we could avoid the for-statement looping needed by the (CPU-based) reference method.
  • We could reduce the batched 1D-FFT execution time by about a factor of 20 by applying the GPU-based rather than the CPU-based approach.
  • Easy comparison of Java-based and ‘C for CUDA’-based benchmarking appeared to be hindered by the choices made for the JCuda implementation.
  • The CUDA-based benchmark results reported in this work seemed to be limited by the data-transfer bandwidth of the computer's PCI Express 2.0 ×16 bus.
  • If data-transfer speed is indeed the limiting factor, significant computational acceleration can only be achieved if major parts of the numerical calculations can be carried out on the CUDA GPUs.
  • In that context, the enhanced double-precision support and the amount of local memory of recent and future CUDA devices will become important.
  • Using CUDA-based batched 1D-FFT, we could carry out a sample user-guided exhaustive search in MRS parameter space.
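
The for-statement looping mentioned for the CPU-based reference method can be illustrated with a minimal Java sketch. This is not the authors' code: the class name `BatchedFft`, the method names, the contiguous signal layout and the radix-2 algorithm are all assumptions made for illustration. On the GPU side, the loop in `batchedFft` is what a single batched CUFFT plan (the `batch` argument of `cufftPlan1d`) would replace.

```java
// Hypothetical CPU reference: transforms `batch` signals one after another.
final class BatchedFft {

    // In-place iterative radix-2 FFT of the n samples starting at `off`.
    // Assumption: n is a power of two; re/im hold real and imaginary parts.
    public static void fft(double[] re, double[] im, int off, int n) {
        if (n <= 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("n must be a power of two");
        }
        // Bit-reversal permutation of the input samples.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double t = re[off + i]; re[off + i] = re[off + j]; re[off + j] = t;
                t = im[off + i]; im[off + i] = im[off + j]; im[off + j] = t;
            }
        }
        // Butterfly stages for sub-transform lengths 2, 4, ..., n.
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2.0 * Math.PI / len;
            for (int i = 0; i < n; i += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k);
                    double wi = Math.sin(ang * k);
                    int a = off + i + k;
                    int b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr;
                    im[b] = im[a] - ti;
                    re[a] += tr;
                    im[a] += ti;
                }
            }
        }
    }

    // The explicit loop over signals; signal s occupies indices [s*n, (s+1)*n).
    public static void batchedFft(double[] re, double[] im, int n, int batch) {
        for (int s = 0; s < batch; s++) {
            fft(re, im, s * n, n);
        }
    }

    public static void main(String[] args) {
        // Two length-4 signals: a unit impulse and a constant.
        double[] re = {1, 0, 0, 0, 1, 1, 1, 1};
        double[] im = new double[re.length];
        batchedFft(re, im, 4, 2);
        System.out.println(java.util.Arrays.toString(re));
    }
}
```

Each signal is transformed independently, so the loop iterations carry no dependencies on one another; that independence is what lets a batched GPU transform process all signals in parallel.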
Details
- Database :
- OpenAIRE
- Journal :
- 2010 IEEE International Conference on Imaging Systems and Techniques
- Accession number :
- edsair.doi...........bced84370e5522de5b777a4e3df4b1bf