Solving Electromagnetic Scattering Problems With Tens of Billions of Unknowns Using GPU Accelerated Massively Parallel MLFMA

Wei Jia He; Zeng Yang; Xiao Wei Huang; Wu Wang; Ming Lin Yang; Xin Qing Sheng

doi:10.1109/TAP.2022.3161520

Wei Jia He, Zeng Yang, Xiao Wei Huang, Wu Wang, Ming Lin Yang^*, Xin Qing Sheng

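^*Corresponding author for this work

School of Integrated Circuits and Electronics

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract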

In this article, a massively parallel approach of the multilevel fast multipole algorithm (PMLFMA) on graphics processing unit (GPU) heterogeneous platform, noted as GPU-PMLFMA, is presented for solving extremely large electromagnetic scattering problems involving tens of billions of unknowns. In this approach, the flexible and efficient ternary partitioning scheme is employed at first to partition the MLFMA octree among message-passing interface (MPI) processes. Then, the computationally intensive parts of the PMLFMA on each MPI process, matrix filling, aggregation and disaggregation, and so on are accelerated by using the GPU. Different parallelization strategies in coincidence with the ternary parallel MLFMA approach are designed for GPU to ensure high computational throughput. Special memory usage strategy is designed to improve computational efficiency and benefit data reusing. The CPU/GPU asynchronous computing pattern is designed with the OpenMP and compute unified device architecture (CUDA), respectively, for accelerating the CPU and GPU execution parts and computation time overlapped. GPU architecture-based optimization strategies are implemented to further improve the computational efficiency. Numerical results demonstrate that the proposed GPU-PMLFMA can achieve over three times speedup, compared with the eight-threaded conventional PMLFMA. Solutions of scattering by electrically large and complicated objects with about 24 000 wavelengths and over 41.8 billion unknowns are presented.

Original language	English
Pages (from-to)	5672-5682
Number of pages	11
Journal	IEEE Transactions on Antennas and Propagation
Volume	70
Issue number	7
DOI	https://doi.org/10.1109/TAP.2022.3161520
Publication status	Published - 1 Jul 2022

Cite this

@article{0aed52237c6040aa9bad0a8c19460b11,

title = "Solving Electromagnetic Scattering Problems With Tens of Billions of Unknowns Using GPU Accelerated Massively Parallel MLFMA",

abstract = "In this article, a massively parallel approach of the multilevel fast multipole algorithm (PMLFMA) on graphics processing unit (GPU) heterogeneous platform, noted as GPU-PMLFMA, is presented for solving extremely large electromagnetic scattering problems involving tens of billions of unknowns. In this approach, the flexible and efficient ternary partitioning scheme is employed at first to partition the MLFMA octree among message-passing interface (MPI) processes. Then, the computationally intensive parts of the PMLFMA on each MPI process, matrix filling, aggregation and disaggregation, and so on are accelerated by using the GPU. Different parallelization strategies in coincidence with the ternary parallel MLFMA approach are designed for GPU to ensure high computational throughput. Special memory usage strategy is designed to improve computational efficiency and benefit data reusing. The CPU/GPU asynchronous computing pattern is designed with the OpenMP and compute unified device architecture (CUDA), respectively, for accelerating the CPU and GPU execution parts and computation time overlapped. GPU architecture-based optimization strategies are implemented to further improve the computational efficiency. Numerical results demonstrate that the proposed GPU-PMLFMA can achieve over three times speedup, compared with the eight-threaded conventional PMLFMA. Solutions of scattering by electrically large and complicated objects with about 24 000 wavelengths and over 41.8 billion unknowns are presented.",

keywords = "Compute unified device architecture (CUDA), OpenMP, extremely large-scale problems, message-passing interface (MPI) parallelization, multilevel fast multipole algorithm (MLFMA), scattering problems",

author = "He, {Wei Jia} and Zeng Yang and Huang, {Xiao Wei} and Wu Wang and Yang, {Ming Lin} and Sheng, {Xin Qing}",

note = "Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",

year = "2022",

month = jul,

day = "1",

doi = "10.1109/TAP.2022.3161520",

language = "English",

volume = "70",

pages = "5672--5682",

journal = "IEEE Transactions on Antennas and Propagation",

issn = "0018-926X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "7",

}

TY - JOUR

T1 - Solving Electromagnetic Scattering Problems With Tens of Billions of Unknowns Using GPU Accelerated Massively Parallel MLFMA

AU - He, Wei Jia

AU - Yang, Zeng

AU - Huang, Xiao Wei

AU - Wang, Wu

AU - Yang, Ming Lin

AU - Sheng, Xin Qing

PY - 2022/7/1

Y1 - 2022/7/1

N2 - In this article, a massively parallel approach of the multilevel fast multipole algorithm (PMLFMA) on graphics processing unit (GPU) heterogeneous platform, noted as GPU-PMLFMA, is presented for solving extremely large electromagnetic scattering problems involving tens of billions of unknowns. In this approach, the flexible and efficient ternary partitioning scheme is employed at first to partition the MLFMA octree among message-passing interface (MPI) processes. Then, the computationally intensive parts of the PMLFMA on each MPI process, matrix filling, aggregation and disaggregation, and so on are accelerated by using the GPU. Different parallelization strategies in coincidence with the ternary parallel MLFMA approach are designed for GPU to ensure high computational throughput. Special memory usage strategy is designed to improve computational efficiency and benefit data reusing. The CPU/GPU asynchronous computing pattern is designed with the OpenMP and compute unified device architecture (CUDA), respectively, for accelerating the CPU and GPU execution parts and computation time overlapped. GPU architecture-based optimization strategies are implemented to further improve the computational efficiency. Numerical results demonstrate that the proposed GPU-PMLFMA can achieve over three times speedup, compared with the eight-threaded conventional PMLFMA. Solutions of scattering by electrically large and complicated objects with about 24 000 wavelengths and over 41.8 billion unknowns are presented.

AB - In this article, a massively parallel approach of the multilevel fast multipole algorithm (PMLFMA) on graphics processing unit (GPU) heterogeneous platform, noted as GPU-PMLFMA, is presented for solving extremely large electromagnetic scattering problems involving tens of billions of unknowns. In this approach, the flexible and efficient ternary partitioning scheme is employed at first to partition the MLFMA octree among message-passing interface (MPI) processes. Then, the computationally intensive parts of the PMLFMA on each MPI process, matrix filling, aggregation and disaggregation, and so on are accelerated by using the GPU. Different parallelization strategies in coincidence with the ternary parallel MLFMA approach are designed for GPU to ensure high computational throughput. Special memory usage strategy is designed to improve computational efficiency and benefit data reusing. The CPU/GPU asynchronous computing pattern is designed with the OpenMP and compute unified device architecture (CUDA), respectively, for accelerating the CPU and GPU execution parts and computation time overlapped. GPU architecture-based optimization strategies are implemented to further improve the computational efficiency. Numerical results demonstrate that the proposed GPU-PMLFMA can achieve over three times speedup, compared with the eight-threaded conventional PMLFMA. Solutions of scattering by electrically large and complicated objects with about 24 000 wavelengths and over 41.8 billion unknowns are presented.

KW - Compute unified device architecture (CUDA)

KW - OpenMP

KW - extremely large-scale problems

KW - message-passing interface (MPI) parallelization

KW - multilevel fast multipole algorithm (MLFMA)

KW - scattering problems

UR - http://www.scopus.com/inward/record.url?scp=85127527174&partnerID=8YFLogxK

U2 - 10.1109/TAP.2022.3161520

DO - 10.1109/TAP.2022.3161520

M3 - Article

AN - SCOPUS:85127527174

SN - 0018-926X

VL - 70

SP - 5672

EP - 5682

JO - IEEE Transactions on Antennas and Propagation

JF - IEEE Transactions on Antennas and Propagation

IS - 7

ER -
