1. HotCluster: A Thermal-Aware Defect Recovery Method for Through-Silicon-Vias Toward Reliable 3-D ICs Systems
- Author
- Akram Ben Ahmed, Xuan-Tu Tran, Abderazek Ben Abdallah, and Khanh N. Dang
- Subjects
Router, Through-silicon via, Computer science, Reliability (computer networking), Hardware_PERFORMANCEANDRELIABILITY, 02 engineering and technology, Fault (power engineering), Computer Graphics and Computer-Aided Design, 020202 computer hardware & architecture, Computer engineering, Hardware_INTEGRATEDCIRCUITS, 0202 electrical engineering, electronic engineering, information engineering, Redundancy (engineering), Sensitivity (control systems), Electrical and Electronic Engineering, Cluster analysis, Software, Electronic circuit
- Abstract
Through-Silicon Via (TSV) is considered the near-future solution for realizing low-power, high-performance 3D Integrated Circuits (3D-ICs) and 3D Networks-on-Chip (3D-NoCs). However, the lifetime reliability of TSVs is one of the most critical challenges, owing to their fault sensitivity and to the high operating temperature of 3D-ICs, which also accelerates the fault rate. Meanwhile, most current work focuses on detecting and correcting TSV defects after manufacturing without considering the impact of high-temperature nodes on lifetime reliability. In addition, recovering defective clusters is challenging because redundancy is costly. In this work, we present HotCluster: a hotspot-aware self-correction platform for clustering defects in 3D-NoCs to help understand and tackle this problem. We first present a method that predicts normalized fault rates and places redundant TSV groups according to each region's fault rate. In our medium fault-rate case (normalized to the coolest area), HotCluster reduces redundancy by about 60% compared with uniformly distributed redundancy while keeping a higher ratio of routers working in a normal state. Furthermore, HotCluster integrates both online (weight-based) and offline (max-flow min-cut) mapping algorithms to help the system correct faulty TSV clusters. The experimental results show that both the max-flow min-cut offline method and the weight-based online mode, with a redundancy of 0.25, leave fewer than 1% of routers disabled under a 50% defect rate.
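The offline mapping mentioned in the abstract can be pictured as a flow problem: each faulty TSV cluster needs one spare group, and each spare group can serve at most one cluster. The following is only a minimal sketch of that idea, not the authors' algorithm. It uses bipartite augmenting-path matching, which is the unit-capacity special case of max-flow. The function name `assign_spares`, the `candidates` mapping, and the cluster and spare labels are all hypothetical, introduced here for illustration.

```python
def assign_spares(candidates):
    """Match faulty clusters to spare TSV groups (illustrative sketch only).

    candidates: dict mapping each faulty cluster (hypothetical label) to the
    list of spare groups it is allowed to borrow. Returns a dict
    faulty_cluster -> spare_group describing a maximum matching.
    """
    owner = {}  # spare group -> faulty cluster currently assigned to it

    def try_assign(faulty, seen):
        # Search for an augmenting path starting at this faulty cluster.
        for spare in candidates[faulty]:
            if spare in seen:
                continue
            seen.add(spare)
            # Take a free spare, or re-route the current owner to another one.
            if spare not in owner or try_assign(owner[spare], seen):
                owner[spare] = faulty
                return True
        return False

    for faulty in candidates:
        try_assign(faulty, set())
    return {faulty: spare for spare, faulty in owner.items()}


if __name__ == "__main__":
    # Hypothetical example: three faulty clusters compete for two spares.
    example = {"A": ["s1", "s2"], "B": ["s1"], "C": ["s2"]}
    print(assign_spares(example))
```

In this toy input only two spares exist, so at most two of the three faulty clusters can be recovered; a cluster left unmatched corresponds to a router that would have to be disabled or handled by other means.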
- Published
- 2022