Abstract
A many-core parallel implementation of the multilevel fast multipole algorithm (MLFMA), based on the Athread parallel programming model, is presented for China's homegrown SW26010 many-core processor. In the proposed implementation, data access efficiency is improved by using data structures with a structure-of-arrays layout. Adaptive workload distribution strategies are adopted on different MLFMA tree levels to ensure full utilization of both the computing capability and the scratchpad memory. A double buffering scheme is designed to overlap communication with computation. The resulting Athread-based many-core implementation of the MLFMA is capable of solving real-life problems with over one million unknowns. The capability and efficiency of the proposed method are analyzed through examples that compute scattering by spheres and by a practical aircraft. Numerical results show that the proposed parallel scheme achieves total speedup ratios of 6.4 to 8.0 compared with the CPU master core.
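The two data-movement ideas named in the abstract, a structure-of-arrays layout and double buffering, can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Athread API: `CHUNK`, `load_chunk`, and `total_power` are hypothetical names, the chunk copy stands in for a transfer into scratchpad memory, and the loop runs sequentially, so it only shows the order in which loads and computation would be issued.

```python
from array import array

CHUNK = 4  # hypothetical scratchpad capacity, in elements per field


def load_chunk(re, im, start, buf):
    """Stand-in for a memory transfer: copy one chunk of each field into a local buffer."""
    stop = min(start + CHUNK, len(re))
    buf["re"] = re[start:stop]
    buf["im"] = im[start:stop]


def total_power(re, im):
    """Sum |z|^2 over structure-of-arrays data using two alternating local buffers."""
    n = len(re)
    if n == 0:
        return 0.0
    buffers = [{"re": None, "im": None}, {"re": None, "im": None}]
    load_chunk(re, im, 0, buffers[0])  # fill the first buffer before the loop
    total = 0.0
    current = 0
    for start in range(0, n, CHUNK):
        nxt = start + CHUNK
        if nxt < n:
            # Request the next chunk into the idle buffer. On real hardware this
            # transfer would be asynchronous and overlap the computation below.
            load_chunk(re, im, nxt, buffers[1 - current])
        buf = buffers[current]
        for x, y in zip(buf["re"], buf["im"]):
            total += x * x + y * y
        current = 1 - current  # swap the roles of the two buffers
    return total


if __name__ == "__main__":
    # Real and imaginary parts live in separate contiguous arrays (structure of arrays).
    re = array("d", [1, 2, 3, 4, 5, 6, 7, 8, 9])
    im = array("d", [1] * 9)
    print(total_power(re, im))
```

Keeping the real and imaginary parts in separate contiguous arrays means each chunk transfer moves one unit-stride block per field, which is the usual motivation for a structure-of-arrays layout on scratchpad-based cores.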
Original language | English |
---|---|
Pages (from-to) | 1502-1516 |
Number of pages | 15 |
Journal | Journal of Supercomputing |
Volume | 77 |
Issue number | 2 |
Publication status | Published - Feb 2021 |
Keywords
- 3D scattering
- Many-core parallelization
- Multilevel fast multipole algorithm
- Surface integral equations
- SW26010 processor