1. Massive parallelization of multilevel fast multipole algorithm for 3-D electromagnetic scattering problems on SW26010 many-core cluster.
- Author
- Liu, Xin-Duo; He, Wei-Jia; Yang, Ming-Lin; and Sheng, Xin-Qing
- Subjects
- *ELECTROMAGNETIC wave scattering, *SPARSE matrices, *PARALLEL programming, *ALGORITHMS, *MESSAGE passing (Computer science), *CACHE memory, *INTERPOLATION
- Abstract
This paper presents a massively parallel implementation of the multilevel fast multipole algorithm (PMLFMA) on China's homegrown many-core SW26010 cluster, denoted SW-PMLFMA, for 3-D electromagnetic scattering problems. In this approach, the multilevel fast multipole algorithm (MLFMA) octree is first partitioned among the management processing elements (MPEs) of the SW26010 processors following the ternary partitioning scheme, using the message passing interface (MPI). Then, the computationally intensive parts of the PMLFMA on each MPI process (matrix filling, aggregation, and disaggregation) are accelerated using all 64 computing processing elements (CPEs) in the same core group as the MPE via the Athread parallel programming model. Different parallelization strategies are designed for the many-core accelerators to ensure high computational throughput. In keeping with the special characteristics of local Lagrange interpolation, the compressed sparse row (CSR) and compressed sparse column (CSC) sparse matrix storage formats are used for storing the interpolation and anterpolation matrices, respectively, together with a specially designed cache mechanism of hybrid dynamic and static buffers in the scratchpad memory (SPM) to improve data access efficiency. Numerical results are included to demonstrate the efficiency and versatility of the proposed method. The proposed parallel scheme is shown to have excellent speedup. [ABSTRACT FROM AUTHOR]
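The CSR/CSC pairing mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes, as is common in MLFMA formulations but not stated in the abstract, that anterpolation is applied as the transpose of the interpolation matrix, and it substitutes a toy 1-D linear interpolation (3 coarse samples to 5 fine samples) for local Lagrange interpolation on the sphere. The function names `csr_matvec` and `csc_matvec` are hypothetical.

```python
def csr_matvec(indptr, indices, data, x):
    """Return y = A @ x for A stored row by row (CSR)."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Each row holds only the few weights of one local stencil.
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def csc_matvec(indptr, indices, data, x, n_rows):
    """Return y = B @ x for B stored column by column (CSC)."""
    y = [0.0] * n_rows
    for col in range(len(indptr) - 1):
        # Scatter column `col`, scaled by x[col], into the output.
        for k in range(indptr[col], indptr[col + 1]):
            y[indices[k]] += data[k] * x[col]
    return y


# Toy interpolation matrix A (5 fine samples x 3 coarse samples), in CSR.
# Assumed weights: linear interpolation at the midpoints.
indptr = [0, 1, 3, 4, 6, 7]
indices = [0, 0, 1, 1, 1, 2, 2]
data = [1.0, 0.5, 0.5, 1.0, 0.5, 0.5, 1.0]

coarse = [1.0, 3.0, 5.0]
fine = csr_matvec(indptr, indices, data, coarse)

# The same three arrays, read as CSC, describe the transpose A^T (3 x 5),
# so the assumed anterpolation reuses the storage without a second copy.
fine_ones = [1.0, 1.0, 1.0, 1.0, 1.0]
anterpolated = csc_matvec(indptr, indices, data, fine_ones, n_rows=3)
```

Under these assumptions, one set of arrays serves both operators: read as CSR it gives the interpolation matrix, and read as CSC it gives the transpose. Whether the paper exploits this sharing, and how the buffers map onto the SPM, cannot be determined from the abstract.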
- Published
- 2024