A Low-Cost Floating-Point FMA Unit Supporting Package Operations for HPC-AI Applications

Authors :
Tan, Hongbing
Zhang, Jing
He, Xiaowei
Huang, Libo
Wang, Yongwen
Xiao, Liquan
Source :
Circuits and Systems II: Express Briefs, IEEE Transactions on; 2024, Vol. 71 Issue: 7 p3488-3492, 5p
Publication Year :
2024

Abstract

The convergence of HPC and AI has diversified the numerical precisions in use, which poses significant hardware implementation challenges. This brief addresses the issue by presenting a low-cost floating-point (FP) fused multiply-add (FMA) unit that supports a wide range of FP formats. For formats narrower than 64 bits (SP, TF32, BF16, and HP), the unit performs standard or mixed-precision operations in parallel and fully pipelined. For the 64-bit DP format, FMA and ADD operations, whether independent or data-related, can be organized into package operations that execute in two consecutive cycles, which eliminates pipeline stalls and thereby improves performance. The proposed FMA unit uses iteration and hardware vectorization to balance cost against performance. Compared with a conventional DP FMA unit, the proposed design supports a wider range of FP formats and functions while achieving higher performance at lower cost. When running HPC-AI applications, it improves performance by up to 1.5x compared with the dual-mode FMA unit.
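
For readers unfamiliar with the terms, the following Python sketch illustrates two background ideas the abstract relies on: the standard bit layouts of the formats it names, and the single rounding that distinguishes a fused multiply-add from a separate multiply followed by an add. It is not the authors' design. The names `FORMATS`, `total_bits`, `fused_multiply_add`, and `unfused_multiply_add` are hypothetical, and the fused result is emulated in software with exact rational arithmetic rather than modeled after the proposed hardware.

```python
from fractions import Fraction

# Standard field widths (sign, exponent, fraction bits) for the formats
# named in the abstract. These are the commonly published layouts, not
# details taken from the paper itself.
FORMATS = {
    "DP": (1, 11, 52),    # IEEE 754 binary64
    "SP": (1, 8, 23),     # IEEE 754 binary32
    "TF32": (1, 8, 10),   # TensorFloat-32
    "BF16": (1, 8, 7),    # bfloat16
    "HP": (1, 5, 10),     # IEEE 754 binary16
}


def total_bits(name):
    """Return the number of bits occupied by one value of the given format."""
    return sum(FORMATS[name])


def fused_multiply_add(a, b, c):
    """Emulate a DP FMA: compute a*b + c exactly, then round once to double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # a single, correctly rounded conversion


def unfused_multiply_add(a, b, c):
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c


if __name__ == "__main__":
    for name in FORMATS:
        print(name, total_bits(name), "bits")

    # The exact product is 1 - 2**-60, which rounds to 1.0 in double
    # precision, so the unfused version loses the small term entirely.
    a = 1.0 + 2.0**-30
    b = 1.0 - 2.0**-30
    c = -1.0
    print("fused:  ", fused_multiply_add(a, b, c))
    print("unfused:", unfused_multiply_add(a, b, c))
```

With these inputs the fused version preserves the residual of magnitude 2^-60, while the unfused version returns zero because the intermediate product is rounded before the addition.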

Details

Language :
English
ISSN :
1549-7747 and 1558-3791
Volume :
71
Issue :
7
Database :
Supplemental Index
Journal :
Circuits and Systems II: Express Briefs, IEEE Transactions on
Publication Type :
Periodical
Accession number :
ejs66894048
Full Text :
https://doi.org/10.1109/TCSII.2024.3359678