Post-Silicon Heat-Source Identification and Machine-Learning-Based Thermal Modeling Using Infrared Thermal Imaging.

Authors :
Sadiqbatcha, Sheriff
Zhang, Jinwei
Zhao, Hengyang
Amrouch, Hussam
Henkel, Jorg
Tan, Sheldon X.-D.
Source :
IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems; Apr 2021, Vol. 40, Issue 4, p694-707, 14p
Publication Year :
2021

Abstract

In this article, we present a novel post-silicon approach to locating the dominant heat sources on commercial multicore processors using heatmaps measured via an infrared (IR) thermal imaging setup. To locate the heat sources, a 2-D spatial Laplacian transformation is performed on the heatmaps, followed by K-means clustering to find the dominant power/heat-source clusters. This is an exclusively post-silicon approach that does not require any knowledge of the underlying design of the commercial chips other than the information that is publicly available. Since the identified clusters are the thermally vulnerable areas on the die, we then propose a machine-learning-based framework for deriving a thermal model capable of estimating their temperatures during online use. Our approach involves collecting transient temperature data of the aforementioned heat sources and synchronized high-level performance metrics from the chip, and training a long short-term memory (LSTM) neural network (NN) that uses the performance metrics as inputs to estimate the temperatures of the identified heat sources in real time. Since the model is meant for real-time use, we explore methods of reducing the performance overhead and inference time of the model. These include a novel power-correlation-based approach to identifying the thermally irrelevant performance metrics and eliminating them in order to reduce the input dimensionality of the model, and an analysis of network sizing to determine the ideal NN configuration for the problem at hand. The model is trained and tested exclusively using measured thermal data from commercial multicore processors. The experimental results from two Intel multicore processors (i5-3337U and i7-8650U) show that the proposed approach achieves very high accuracy (root-mean-square error: 0.55 °C–0.93 °C) in estimating the temperatures of all the identified heat sources on the chip. [ABSTRACT FROM AUTHOR]
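
Illustrative sketch (not taken from the article): the heat-source localization step described in the abstract, a 2-D spatial Laplacian followed by K-means clustering, can be approximated in a few lines of plain Python. Everything specific here is an assumption made for illustration: the synthetic 20x20 heatmap with two Gaussian hot spots, the 5-point Laplacian stencil, the 0.4 threshold fraction, the farthest-point initialization of K-means, and all function names. The abstract does not state the authors' actual preprocessing, thresholds, or cluster counts.

```python
import math

def make_heatmap(h, w, sources, sigma=2.0, ambient=40.0):
    """Synthetic heatmap (assumption): ambient temperature plus one
    Gaussian bump per (row, col, amplitude) source."""
    return [[ambient + sum(a * math.exp(-((i - r) ** 2 + (j - c) ** 2)
                                        / (2 * sigma ** 2))
                           for r, c, a in sources)
             for j in range(w)] for i in range(h)]

def neg_laplacian(T):
    """Negative 5-point discrete Laplacian; border cells are left at 0.
    Large positive values mark local temperature peaks, i.e. likely sources."""
    h, w = len(T), len(T[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = (4 * T[i][j] - T[i - 1][j] - T[i + 1][j]
                         - T[i][j - 1] - T[i][j + 1])
    return out

def hot_pixels(S, frac=0.4):
    """Pixels whose source strength exceeds frac * global peak,
    strongest first. The fraction is an illustrative choice."""
    peak = max(max(row) for row in S)
    pts = [(i, j) for i, row in enumerate(S)
           for j, v in enumerate(row) if v > frac * peak]
    pts.sort(key=lambda p: -S[p[0]][p[1]])
    return pts

def dist2(p, q):
    """Squared Euclidean distance between two (row, col) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iters=20):
    """Lloyd's K-means with deterministic farthest-point initialization
    (a simplification chosen here to keep the sketch reproducible)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [(sum(p[0] for p in g) / len(g),
                    sum(p[1] for p in g) / len(g)) if g else centers[n]
                   for n, g in enumerate(groups)]
    return centers

# Two hypothetical hot spots at (row 5, col 5) and (row 14, col 13).
heatmap = make_heatmap(20, 20, [(5, 5, 30.0), (14, 13, 25.0)])
centers = sorted(kmeans(hot_pixels(neg_laplacian(heatmap)), k=2))
print(centers)
```

On this synthetic input the two recovered cluster centers coincide with the locations where the hot spots were placed. Measured IR heatmaps are noisy, so a practical pipeline would likely need smoothing before the Laplacian and a more careful choice of threshold and cluster count than this sketch uses.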

Details

Language :
English
ISSN :
0278-0070
Volume :
40
Issue :
4
Database :
Complementary Index
Journal :
IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems
Publication Type :
Academic Journal
Accession number :
149510212
Full Text :
https://doi.org/10.1109/TCAD.2020.3007541