Data mining on vast data sets as a cluster system benchmark
- Source :
- Concurrency and Computation: Practice and Experience. 28:2145-2165
- Publication Year :
- 2015
- Publisher :
- Wiley, 2015.
Abstract
- Comparing different accelerated cluster architectures with a single application is difficult because the application has to be optimized with respect to platform-dependent features. In this work, we demonstrate such an optimization for a data mining algorithm that solves regression and classification problems on vast data sets. Our technique is based on least squares regression, and its major component is the iterative matrix-free solution of a linear system of equations. By processing data sets ranging from several hundred thousand instances to multi-million data points in strong-scaling and weak-scaling settings, we are able to estimate the amount of parallelism needed to unleash the performance of classic CPU-based machines and of clusters employing Intel Xeon Phi coprocessors and NVIDIA Kepler GPUs. Only in strong-scaling experiments do GPUs and coprocessors suffer from the tremendous amount of parallelism they need; there they are outperformed by dual-socket Intel Sandy Bridge nodes at large scale (more than 64 nodes/accelerators). In weak-scaling scenarios, however, a single accelerator can achieve a speed-up larger than 2X over an entire CPU node. Copyright © 2015 John Wiley & Sons, Ltd.
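The "iterative matrix-free solution" mentioned in the abstract can be illustrated with a minimal sketch. This record does not describe the paper's actual discretization, solver, or parallelization, so everything below is an assumption for illustration: a conjugate gradient solver applied to regularized normal equations, with the hypothetical names `conjugate_gradient`, `fit_least_squares`, and `lam`. The point it shows is that the system matrix is never stored; the solver only needs a function that applies it to a vector.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(apply_A: Callable[[Vector], Vector], b: Vector,
                       tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve A x = b for a symmetric positive definite A.

    A is given only as a function x -> A x (matrix-free).
    """
    x = [0.0] * len(b)
    r = list(b)                 # residual b - A x for the start value x = 0
    p = list(r)                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # residual norm small enough: stop
            break
        Ap = apply_A(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def fit_least_squares(features: Sequence[Sequence[float]],
                      targets: Sequence[float],
                      lam: float = 1e-8) -> Vector:
    """Minimise ||X w - y||^2 + lam * ||w||^2 without ever forming X^T X.

    Assumption: a plain linear model with a small ridge term `lam`;
    the paper's own basis functions are not given in this record.
    """
    n_features = len(features[0])

    def apply_X(w: Vector) -> Vector:
        return [dot(row, w) for row in features]

    def apply_Xt(v: Sequence[float]) -> Vector:
        return [sum(row[j] * v[i] for i, row in enumerate(features))
                for j in range(n_features)]

    def apply_A(w: Vector) -> Vector:
        # (X^T X + lam * I) w, computed as two passes over the data
        xtxw = apply_Xt(apply_X(w))
        return [a + lam * wi for a, wi in zip(xtxw, w)]

    return conjugate_gradient(apply_A, apply_Xt(targets))


# Usage: recover intercept 2 and slope 3 from noise-free samples of y = 2 + 3x.
features = [[1.0, float(x)] for x in range(5)]
targets = [2.0 + 3.0 * x for x in range(5)]
weights = fit_least_squares(features, targets)
```

In a sketch like this, each solver iteration costs two sweeps over the data set (`apply_X` and `apply_Xt`), which is the kind of data-parallel work that would be distributed across nodes or accelerators.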
- Subjects :
- Coprocessor
Computer Networks and Communications
Computer science
Scale (descriptive set theory)
02 engineering and technology
Parallel computing
Computer Science Applications
Theoretical Computer Science
Computer Science::Performance
Computational Theory and Mathematics
020204 information systems
Component (UML)
Node (computer science)
0202 electrical engineering, electronic engineering, information engineering
Benchmark (computing)
020201 artificial intelligence & image processing
Software
Xeon Phi
Details
- ISSN :
- 15320626
- Volume :
- 28
- Database :
- OpenAIRE
- Journal :
- Concurrency and Computation: Practice and Experience
- Accession number :
- edsair.doi...........5442d23df33759c169670d2663164cb0