Massive parallelization of multilevel fast multipole algorithm for 3-D electromagnetic scattering problems on SW26010 many-core cluster

Xin Duo Liu, Wei Jia He*, Ming Lin Yang, Xin Qing Sheng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a massively parallel implementation of the multilevel fast multipole algorithm (PMLFMA) on China's homegrown SW26010 many-core cluster, denoted SW-PMLFMA, for 3-D electromagnetic scattering problems. In this approach, the multilevel fast multipole algorithm (MLFMA) octree is first partitioned among the management processing elements (MPEs) of the SW26010 processors following the ternary partitioning scheme, using the message passing interface (MPI). Then, the computationally intensive parts of the PMLFMA on each MPI process, namely matrix filling, aggregation, and disaggregation, are accelerated by using all 64 computing processing elements (CPEs) in the same core group as the MPE via the Athread parallel programming model. Different parallelization strategies are designed for the many-core accelerators to ensure high computational throughput. To match the special characteristics of local Lagrange interpolation, the compressed sparse row (CSR) and compressed sparse column (CSC) sparse matrix storage formats are used for storing the interpolation and anterpolation matrices, respectively, together with a specially designed cache mechanism of hybrid dynamic and static buffers in the scratchpad memory (SPM) to improve data access efficiency. Numerical results demonstrate the efficiency and versatility of the proposed method and show that the parallel scheme achieves excellent speedup.
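
The CSR/CSC pairing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, as is common in MLFMA, that anterpolation applies the transpose of the interpolation matrix, and it uses a toy 4x3 matrix with two weights per row in place of real local Lagrange weights. The function names and data are hypothetical. Under that assumption, the CSR arrays of the interpolation matrix W are also the CSC arrays of its transpose, so one set of arrays can serve both products.

```python
def csr_matvec(n_rows, indptr, indices, data, x):
    """Compute y = A @ x with A stored in CSR (traversed row by row)."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]  # indices[k] is a column index
        y[i] = acc
    return y


def csc_matvec(n_rows, indptr, indices, data, x):
    """Compute y = B @ x with B stored in CSC (traversed column by column)."""
    y = [0.0] * n_rows
    for j in range(len(indptr) - 1):
        xj = x[j]
        for k in range(indptr[j], indptr[j + 1]):
            y[indices[k]] += data[k] * xj  # indices[k] is a row index
    return y


# Toy interpolation matrix W (4 fine samples from 3 coarse samples),
# two nonzero weights per row, stored in CSR. Values are illustrative only.
indptr = [0, 2, 4, 6, 8]
indices = [0, 1, 0, 1, 1, 2, 1, 2]
data = [0.75, 0.25, 0.25, 0.75, 0.75, 0.25, 0.25, 0.75]

coarse = [1.0, 2.0, 4.0]
fine = csr_matvec(4, indptr, indices, data, coarse)  # interpolation: W @ coarse

# Assumed anterpolation: W^T @ f. The same three arrays, read as CSC,
# describe W^T (3 rows, 4 columns), so no second copy is built here.
back = csc_matvec(3, indptr, indices, data, [1.0, 1.0, 1.0, 1.0])
```

Both loops walk the stored weights contiguously, which is the access pattern that makes staging them through a small local buffer attractive; how the paper actually maps this onto the SPM is not described in the abstract.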

Original language: English
Pages (from-to): 8702-8718
Number of pages: 17
Journal: Journal of Supercomputing
Volume: 80
Issue number: 7
Publication status: Published - May 2024

Keywords

  • Distributed memory parallelization
  • Electromagnetic scattering
  • Many-core acceleration
  • Multilevel fast multipole algorithm
  • SW26010 processor
