High-Performance Evaluation of the Interpolations and Anterpolations in the GPU-Accelerated Massively Parallel MLFMA

Wei Jia He; Zeng Yang; Xiao Wei Huang; Wu Wang; Ming Lin Yang; Xin Qing Sheng

doi:10.1109/TAP.2023.3269106

Corresponding author: Ming Lin Yang

School of Integrated Circuits and Electronics

Research output: Contribution to journal › Article › peer-review

4 citations (Scopus)

Abstract

This communication investigates high-performance computation schemes for local Lagrange interpolation and anterpolation operations in the parallel graphics processing unit (GPU)-accelerated distributed-memory multilevel fast multipole algorithm (MLFMA). Two ELLPACK format-based schemes, namely, block ELLPACK (ELL-B) and hybrid compressed sparse column (CSC)-ELL-B (CSC-ELL-B), are proposed for the evaluation of interpolation and anterpolation operations, respectively, which ensure high computational throughput for GPU calculation. Optimization using the GPU hierarchical memory architecture, the mechanism of the stream, and the central processing unit (CPU)/GPU asynchronous computation pattern are employed to further improve the overall performance. The proposed schemes are proven to be an order of magnitude faster than the conventional schemes for aggregation/disaggregation operations. For an aircraft model involving over 10 billion unknowns, the iteration time is reduced by over half, which is remarkable progress in the development of GPU-accelerated parallelization of MLFMA.
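As a rough illustration of the storage format the abstract builds on, the sketch below shows plain ELLPACK (not the ELL-B or CSC-ELL-B variants proposed in the paper, whose details are not given here). ELLPACK pads every row to the same number of stored entries. That suits local interpolation matrices, where each output sample typically depends on a small, fixed number of input samples. The transpose product is included on the assumption, common in the MLFMA literature, that anterpolation is applied as the transpose of interpolation. All function names and the small 5-by-3 linear interpolation matrix are hypothetical.

```python
def to_ellpack(dense):
    """Convert a dense matrix (list of rows) to padded ELLPACK arrays.

    Returns (values, columns), each with one fixed-width list per row.
    Padding slots hold value 0.0 and column index 0, so they contribute nothing.
    """
    width = max(sum(1 for v in row if v != 0) for row in dense)
    values, columns = [], []
    for row in dense:
        nz = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nz)
        columns.append([j for j, _ in nz] + [0] * pad)
        values.append([float(v) for _, v in nz] + [0.0] * pad)
    return values, columns


def ellpack_matvec(values, columns, x):
    """y = A x (interpolation-like): each row gathers a fixed number of inputs."""
    y = []
    for vals, cols in zip(values, columns):
        acc = 0.0
        for v, j in zip(vals, cols):
            acc += v * x[j]
        y.append(acc)
    return y


def ellpack_rmatvec(values, columns, y, n_cols):
    """x = A^T y (anterpolation-like): each row scatters into the output."""
    x = [0.0] * n_cols
    for vals, cols, yi in zip(values, columns, y):
        for v, j in zip(vals, cols):
            x[j] += v * yi
    return x


# Hypothetical example: two-point (linear) interpolation from 3 to 5 samples.
dense = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
values, columns = to_ellpack(dense)
fine = ellpack_matvec(values, columns, [1.0, 3.0, 5.0])
coarse = ellpack_rmatvec(values, columns, [1.0] * 5, 3)
```

On a GPU, the fixed row width is what makes this layout attractive: one thread per row performs the same number of multiply-adds. The transpose product instead scatters into shared output entries, so a direct port would need atomic updates or a different storage layout.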

Original language	English
Pages (from-to)	6231-6236
Number of pages	6
Journal	IEEE Transactions on Antennas and Propagation
Volume	71
Issue number	7
DOI	https://doi.org/10.1109/TAP.2023.3269106
Publication status	Published - 1 Jul 2023

Cite this

@article{d87c09974a4d45a7b24711f5a49db3d3,

title = "High-Performance Evaluation of the Interpolations and Anterpolations in the GPU-Accelerated Massively Parallel MLFMA",

abstract = "This communication investigates high-performance computation schemes for local Lagrange interpolation and anterpolation operations in the parallel graphics processing unit (GPU)-accelerated distributed-memory multilevel fast multipole algorithm (MLFMA). Two ELLPACK format-based schemes, namely, block ELLPACK (ELL-B) and hybrid compressed sparse column (CSC)-ELL-B (CSC-ELL-B), are proposed for the evaluation of interpolation and anterpolation operations, respectively, which ensure high computational throughput for GPU calculation. Optimization using the GPU hierarchical memory architecture, the mechanism of the stream, and the central processing unit (CPU)/GPU asynchronous computation pattern are employed to further improve the overall performance. The proposed schemes are proven to be an order of magnitude faster than the conventional schemes for aggregation/disaggregation operations. For an aircraft model involving over 10 billion unknowns, the iteration time is reduced by over half, which is remarkable progress in the development of GPU-accelerated parallelization of MLFMA.",

keywords = "Graphics processing unit (GPU), large-scale electromagnetic scattering, multilevel fast multipole algorithm (MLFMA), parallel",

author = "He, {Wei Jia} and Yang, Zeng and Huang, {Xiao Wei} and Wang, Wu and Yang, {Ming Lin} and Sheng, {Xin Qing}",

note = "Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",

year = "2023",

month = jul,

day = "1",

doi = "10.1109/TAP.2023.3269106",

language = "English",

volume = "71",

pages = "6231--6236",

journal = "IEEE Transactions on Antennas and Propagation",

issn = "0018-926X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "7",

}

TY - JOUR

T1 - High-Performance Evaluation of the Interpolations and Anterpolations in the GPU-Accelerated Massively Parallel MLFMA

AU - He, Wei Jia

AU - Yang, Zeng

AU - Huang, Xiao Wei

AU - Wang, Wu

AU - Yang, Ming Lin

AU - Sheng, Xin Qing

PY - 2023/7/1

Y1 - 2023/7/1

N2 - This communication investigates high-performance computation schemes for local Lagrange interpolation and anterpolation operations in the parallel graphics processing unit (GPU)-accelerated distributed-memory multilevel fast multipole algorithm (MLFMA). Two ELLPACK format-based schemes, namely, block ELLPACK (ELL-B) and hybrid compressed sparse column (CSC)-ELL-B (CSC-ELL-B), are proposed for the evaluation of interpolation and anterpolation operations, respectively, which ensure high computational throughput for GPU calculation. Optimization using the GPU hierarchical memory architecture, the mechanism of the stream, and the central processing unit (CPU)/GPU asynchronous computation pattern are employed to further improve the overall performance. The proposed schemes are proven to be an order of magnitude faster than the conventional schemes for aggregation/disaggregation operations. For an aircraft model involving over 10 billion unknowns, the iteration time is reduced by over half, which is remarkable progress in the development of GPU-accelerated parallelization of MLFMA.

AB - This communication investigates high-performance computation schemes for local Lagrange interpolation and anterpolation operations in the parallel graphics processing unit (GPU)-accelerated distributed-memory multilevel fast multipole algorithm (MLFMA). Two ELLPACK format-based schemes, namely, block ELLPACK (ELL-B) and hybrid compressed sparse column (CSC)-ELL-B (CSC-ELL-B), are proposed for the evaluation of interpolation and anterpolation operations, respectively, which ensure high computational throughput for GPU calculation. Optimization using the GPU hierarchical memory architecture, the mechanism of the stream, and the central processing unit (CPU)/GPU asynchronous computation pattern are employed to further improve the overall performance. The proposed schemes are proven to be an order of magnitude faster than the conventional schemes for aggregation/disaggregation operations. For an aircraft model involving over 10 billion unknowns, the iteration time is reduced by over half, which is remarkable progress in the development of GPU-accelerated parallelization of MLFMA.

KW - Graphics processing unit (GPU)

KW - large-scale electromagnetic scattering

KW - multilevel fast multipole algorithm (MLFMA)

KW - parallel

UR - http://www.scopus.com/inward/record.url?scp=85159646134&partnerID=8YFLogxK

U2 - 10.1109/TAP.2023.3269106

DO - 10.1109/TAP.2023.3269106

M3 - Article

AN - SCOPUS:85159646134

SN - 0018-926X

VL - 71

SP - 6231

EP - 6236

JO - IEEE Transactions on Antennas and Propagation

JF - IEEE Transactions on Antennas and Propagation

IS - 7

ER -
