Performance analysis and tuning for clusters with ccNUMA nodes for scientific computing -- a case study.

Authors :
Kayi, Abdullah
Kornkven, Edward
El-Ghazawi, Tarek
Al-Bahra, Samy
Newby, Gregory B.
Source :
Computer Systems Science & Engineering; Sep2009, Vol. 24 Issue 5, p291-302, 12p, 2 Diagrams, 1 Chart, 12 Graphs
Publication Year :
2009

Abstract

In the quest for higher performance and with the increasing availability of multi-core chips, many systems are currently packing more processors per node. Adopting a ccNUMA node architecture in these cases has the promise of achieving a balance between cost and performance. In this paper, a 2312-core Opteron system based on Sun Fire servers is considered as a case study to examine the performance issues associated with such architectures. In this study, we characterize the performance behavior of the system, with a focus on the node level, using different configurations. It will be shown that the benefits from larger nodes can be severely limited for many reasons. These reasons were isolated, the associated performance losses were assessed, and some potential solutions were proposed. With the proposed performance tunings, up to 30% application performance improvement was observed. The results revealed that such problems were mainly caused by topological imbalances, limitations of the cache coherence protocol used, the distribution of operating system services, and the lack of intelligent management of memory affinity. In addition, the experimental analysis provided can be utilized by HPC application developers to better understand clusters with ccNUMA nodes, and also as a guideline for the use of such architectures for scientific computing. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
02676192
Volume :
24
Issue :
5
Database :
Supplemental Index
Journal :
Computer Systems Science & Engineering
Publication Type :
Academic Journal
Accession number :
44897112