1. Revealing Untapped DSP Optimization Potentials for FPGA-Based Systolic Matrix Engines
- Author
Li, Jindong; Li, Tenglong; Shen, Guobin; Zhao, Dongcheng; Zhang, Qian; and Zeng, Yi
- Subjects
Computer Science - Hardware Architecture
- Abstract
Systolic architectures are widely embraced by neural network accelerators for their superior performance in highly parallelized computation. The DSP48E2s serve as dedicated arithmetic blocks in Xilinx UltraScale series FPGAs and constitute a fundamental component in FPGA-based systolic matrix engines. Harnessing the full potential of DSP48E2s in architectural design can result in significant performance enhancements for systolic architectures on UltraScale series FPGAs. This paper unveils several previously untapped DSP optimization techniques capable of further enhancing FPGA-based systolic matrix engines. We apply these techniques to two well-known systolic architectures: Google TPUv1 and Xilinx Vitis AI DPU. With the proposed techniques, our design achieves substantial resource and power reduction compared to the open-source TPUv1 FPGA implementation and the Vitis AI DPU implementation in the same parallelism setting. We also demonstrate the applicability of our techniques to neuromorphic hardware for supporting spiking neural network acceleration.
- Comment
Accepted by FPL2024
- Published
2024