Performance analysis and tuning for clusters with ccNUMA nodes for scientific computing -- a case study.
- Source :
- Computer Systems Science & Engineering; Sep 2009, Vol. 24 Issue 5, p291-302, 12p, 2 Diagrams, 1 Chart, 12 Graphs
- Publication Year :
- 2009
Abstract
- In the quest for higher performance and with the increasing availability of multi-core chips, many systems are currently packing more processors per node. Adopting a ccNUMA node architecture in these cases has the promise of achieving a balance between cost and performance. In this paper, a 2312-core Opteron system based on Sun Fire servers is considered as a case study to examine the performance issues associated with such architectures. In this study, we characterize the performance behavior of the system with a focus on the node level using different configurations. It will be shown that the benefits from larger nodes can be severely limited for many reasons. These reasons were isolated, the associated performance losses were assessed, and some potential solutions were proposed. With the proposed performance tunings, up to 30% application performance improvement was observed. The results revealed that such problems were mainly caused by topological imbalances, limitations of the cache coherence protocol used, operating system services distribution and the lack of intelligent management of memory affinity. In addition, the experimental analysis provided can be utilized by HPC application developers in order to better understand clusters with ccNUMA nodes and also as a guideline for the use of such architectures for scientific computing. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 02676192
- Volume :
- 24
- Issue :
- 5
- Database :
- Supplemental Index
- Journal :
- Computer Systems Science & Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 44897112