TY - GEN
T1 - Massively Parallel Implementation of Multilevel Fast Multipole Algorithm on Sunway TaihuLight
AU - Liu, Xin Duo
AU - He, Wei Jia
AU - Yang, Ming Lin
AU - Sheng, Xin Qing
N1 - Publisher Copyright: © 2023 Applied Computational Electromagnetics Society (ACES).
PY - 2023
Y1 - 2023
N2 - We present a massively parallel implementation of the multilevel fast multipole algorithm (PMLFMA), denoted SW-PMLFMA, on the homegrown Sunway TaihuLight supercomputer with SW26010 heterogeneous many-core processors, for 3-D electromagnetic scattering problems. In the proposed implementation, the multilevel fast multipole algorithm (MLFMA) octree is first partitioned among the management processing elements (MPEs) of the SW26010 processors following the ternary partitioning scheme, using the message passing interface (MPI). Then the computationally intensive parts of the PMLFMA on each MPI process, such as matrix filling, aggregation, and disaggregation, are accelerated by using all 64 computing processing elements (CPEs) in the same core group as the MPE via the Athread programming model. Different parallelization strategies are designed for the many-core accelerators to ensure high computational throughput. Numerical results are included to demonstrate the efficiency and versatility of the proposed method.
AB - We present a massively parallel implementation of the multilevel fast multipole algorithm (PMLFMA), denoted SW-PMLFMA, on the homegrown Sunway TaihuLight supercomputer with SW26010 heterogeneous many-core processors, for 3-D electromagnetic scattering problems. In the proposed implementation, the multilevel fast multipole algorithm (MLFMA) octree is first partitioned among the management processing elements (MPEs) of the SW26010 processors following the ternary partitioning scheme, using the message passing interface (MPI). Then the computationally intensive parts of the PMLFMA on each MPI process, such as matrix filling, aggregation, and disaggregation, are accelerated by using all 64 computing processing elements (CPEs) in the same core group as the MPE via the Athread programming model. Different parallelization strategies are designed for the many-core accelerators to ensure high computational throughput. Numerical results are included to demonstrate the efficiency and versatility of the proposed method.
KW - Multilevel fast multipole algorithm
KW - SW26010 processor
KW - distributed memory parallelization
KW - electromagnetic scattering
KW - many-core acceleration
UR - http://www.scopus.com/inward/record.url?scp=85174252907&partnerID=8YFLogxK
U2 - 10.23919/ACES-China60289.2023.10249919
DO - 10.23919/ACES-China60289.2023.10249919
M3 - Conference contribution
AN - SCOPUS:85174252907
T3 - 2023 International Applied Computational Electromagnetics Society Symposium, ACES-China 2023
BT - 2023 International Applied Computational Electromagnetics Society Symposium, ACES-China 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Applied Computational Electromagnetics Society Symposium, ACES-China 2023
Y2 - 15 August 2023 through 18 August 2023
ER -