OPTWEB: A Lightweight Fully Connected Inter-FPGA Network for Efficient Collectives.

Authors :
Mizutani, Kenji
Yamaguchi, Hiroshi
Urino, Yutaka
Koibuchi, Michihiro
Source :
IEEE Transactions on Computers. Jun 2021, Vol. 70 Issue 6, p849-862. 14p.
Publication Year :
2021

Abstract

Modern FPGA accelerators can be equipped with many high-bandwidth network I/Os, e.g., 64 x 50 Gbps, enabled by onboard optics or co-packaged optics. Some dozens of tightly coupled FPGA accelerators form an emerging computing platform for distributed data processing. However, a conventional indirect packet network using Ethernet's Intellectual Properties imposes an unacceptably large amount of the logic for handling such high-bandwidth interconnects on an FPGA. Besides the indirect network, another approach builds a direct packet network. Existing direct inter-FPGA networks have a low-radix network topology, e.g., 2-D torus. However, the low-radix network has the disadvantage of a large diameter and large average shortest path length that increases the latency of collectives. To mitigate both problems, we propose a lightweight, fully connected inter-FPGA network called OPTWEB for efficient collectives. Since all end-to-end separate communication paths are statically established using onboard optics, raw block data can be transferred with simple link-level synchronization. Once each source FPGA assigns a communication stream to a path by its internal switch logic between memory-mapped and stream interfaces for remote direct memory access (RDMA), a one-hop transfer is provided. Since each FPGA performs input/output of the remote memory access between all FPGAs simultaneously, multiple RDMAs efficiently form collectives. The OPTWEB network provides 0.71-μsec start-up latency of collectives among multiple Intel Stratix 10 MX FPGA cards with onboard optics. The OPTWEB network consumes 31.4 and 57.7 percent of adaptive logic modules for aggregate 400-Gbps and 800-Gbps interconnects on a custom Stratix 10 MX 2100 FPGA, respectively. The OPTWEB network reduces by 40 percent the cost compared to a conventional packet network. [ABSTRACT FROM AUTHOR]
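
The abstract's topology argument (a low-radix 2-D torus has a large diameter and average shortest path length, while a fully connected network gives one-hop transfers) can be illustrated with a small hop-count calculation. This is a minimal sketch, not code from the paper: the function names and the 4 x 4 and 8 x 8 node counts are assumptions chosen for illustration, and it models hop counts only, not latency or FPGA logic cost.

```python
from itertools import product


def torus_2d_stats(k):
    """Return (diameter, average shortest path length) in hops for a k x k 2-D torus.

    Hypothetical helper for illustration; k is the number of nodes per dimension.
    """
    def ring_dist(d):
        # Shortest distance around a ring of k nodes (wrap-around links).
        return min(d, k - d)

    # A torus is vertex-symmetric, so distances from node (0, 0)
    # are representative of every other source node.
    dists = [
        ring_dist(x) + ring_dist(y)
        for x, y in product(range(k), repeat=2)
        if (x, y) != (0, 0)
    ]
    return max(dists), sum(dists) / len(dists)


def fully_connected_stats(n):
    """Return (diameter, average shortest path length) for n fully connected nodes."""
    # Every pair of distinct nodes shares a dedicated link: always one hop.
    return 1, 1.0


def links_per_node_fully_connected(n):
    """Each node needs one link to every other node."""
    return n - 1


if __name__ == "__main__":
    for k in (4, 8):  # illustrative sizes, not taken from the paper
        n = k * k
        t_diam, t_avg = torus_2d_stats(k)
        f_diam, f_avg = fully_connected_stats(n)
        print(f"{n} nodes: torus diameter={t_diam}, avg={t_avg:.2f}; "
              f"fully connected diameter={f_diam}, avg={f_avg:.2f}, "
              f"links per node={links_per_node_fully_connected(n)}")
```

The trade-off visible in the sketch is that the one-hop property costs n - 1 links per node, which is why the abstract's premise of many high-bandwidth network I/Os per FPGA matters for a fully connected design.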

Details

Language :
English
ISSN :
00189340
Volume :
70
Issue :
6
Database :
Academic Search Index
Journal :
IEEE Transactions on Computers
Publication Type :
Academic Journal
Accession number :
150448421
Full Text :
https://doi.org/10.1109/TC.2021.3068715