1. Inherently Workload-Balanced Clustered Microarchitecture
- Author
- Antonio González, Jaume Abella, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
- Subjects
Pipelines, Speedup, Computer science, Distributed computing, Process design, Ring network, Workload, Energy consumption, Parallel computing, Topology, Microarchitecture, Wire, Key (cryptography), Resource allocation, Computer architecture, Delay effects, Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC], Microprocessors, Clocks
- Abstract
The performance of clustered microarchitectures relies on steering schemes that try to find the best trade-off between workload balance and inter-cluster communication penalties. In previously proposed clustered processors, reducing communication penalties and balancing the workload are opposite targets, since improving one usually implies a detriment in the other. In this paper we propose a new clustered microarchitecture that can minimize communication penalties without compromising workload balance. The key idea is to arrange the clusters in a ring topology in such a way that results of one cluster can be forwarded to the neighbor cluster with a very short latency. In this way, minimizing communication penalties is favored when the producer of a value and its consumer are placed in adjacent clusters, which also favors workload balance. The proposed microarchitecture is shown to outperform a state-of-the-art clustered processor. For instance, for an 8-cluster configuration and just one fully pipelined unidirectional bus, 15% speedup is achieved on average for FP programs.
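The ring idea in the abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the paper's steering algorithm: the function names `ring_hops` and `steer_chain` are hypothetical, the ring is assumed unidirectional, and each consumer is assumed to be placed in the cluster immediately after its producer. Under those assumptions a chain of dependent instructions rotates around the ring, so every producer-to-consumer transfer is a single neighbor hop and the instructions spread evenly over the clusters.

```python
from collections import Counter


def ring_hops(producer: int, consumer: int, n_clusters: int) -> int:
    """Hops from producer to consumer on a unidirectional ring (assumed model)."""
    return (consumer - producer) % n_clusters


def steer_chain(chain_length: int, n_clusters: int, start: int = 0) -> list[int]:
    """Place each instruction of a dependence chain in the cluster after its producer.

    Hypothetical policy used only for illustration; the paper's actual
    steering scheme is not described in the abstract.
    """
    if chain_length <= 0:
        return []
    placement = [start % n_clusters]
    for _ in range(chain_length - 1):
        placement.append((placement[-1] + 1) % n_clusters)
    return placement


# Example: a 16-instruction dependence chain on the 8-cluster configuration.
placement = steer_chain(16, 8)
load = Counter(placement)  # instructions per cluster
hops = [ring_hops(p, c, 8) for p, c in zip(placement, placement[1:])]
```

In this toy setting every entry of `hops` is 1 and every cluster receives the same number of instructions, which is the sense in which favoring neighbor communication also favors balance. Real instruction streams have multiple operands and independent chains, so the sketch shows the intuition only.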
- Published
- 2005