Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems

Jonathan Vincent; Jing Gong; Martin Karp; Adam Peplinski; Niclas Jansson; Artur Podobas; Andreas Jocksch; Jie Yao; Fazle Hussain; Stefano Markidis; Matts Karlsson; Dirk Pleiter; Erwin Laure; Philipp Schlatter

doi:10.1145/3492805.3492818

KTH Royal Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

4 citations (Scopus)

Abstract

We present new results on the strong parallel scaling of the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case is a direct numerical simulation of fully developed turbulent flow in a straight pipe at two Reynolds numbers, Reτ = 360 and Reτ = 550, based on friction velocity and pipe radius. Strong scaling is tested on several GPU-enabled HPC systems, including the Swiss Piz Daint system, TACC's Longhorn, Jülich's JUWELS Booster, and Berzelius in Sweden. The performance results show that a speed-up of between 3 and 5 can be achieved with the GPU-accelerated version compared with the CPU version on these systems. The run time for 20 timesteps falls from 43.5 to 13.2 seconds when the number of GPUs is increased from 64 to 512 for the Reτ = 550 case on the JUWELS Booster system. This illustrates the potential of the GPU-accelerated version for high throughput. At the same time, the strong scaling limit is significantly larger for GPUs, at about 2000-5000 elements per rank, compared with about 50-100 for a CPU rank.
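The timing figures quoted in the abstract can be turned into a strong-scaling speed-up and parallel efficiency with a few lines of arithmetic. The following is a minimal sketch, not code from the paper: the function names `strong_scaling` and `max_useful_ranks` are hypothetical, and the mesh size of one million elements is an assumed illustration value that does not appear in the record.

```python
def strong_scaling(t_base, n_base, t_scaled, n_scaled):
    """Return (speedup, ideal_speedup, parallel_efficiency) relative to the base run."""
    speedup = t_base / t_scaled      # measured speed-up for a fixed problem size
    ideal = n_scaled / n_base        # speed-up expected under perfect scaling
    return speedup, ideal, speedup / ideal


def max_useful_ranks(total_elements, min_elements_per_rank):
    """Largest rank count that still leaves min_elements_per_rank on every rank."""
    return total_elements // min_elements_per_rank


# Figures quoted in the abstract: Re_tau = 550 on JUWELS Booster, 20 timesteps,
# 43.5 s on 64 GPUs and 13.2 s on 512 GPUs.
speedup, ideal, efficiency = strong_scaling(43.5, 64, 13.2, 512)
print(f"speed-up {speedup:.2f}x of ideal {ideal:.0f}x, efficiency {efficiency:.0%}")

# ASSUMPTION: a hypothetical mesh of 1,000,000 elements, chosen only for illustration.
example_elements = 1_000_000
print("GPU ranks at 2000 elements/rank:", max_useful_ranks(example_elements, 2000))
print("CPU ranks at 50 elements/rank:", max_useful_ranks(example_elements, 50))
```

With the quoted timings, an eightfold increase in GPU count gives a speed-up of roughly 3.3, that is, a parallel efficiency of about 41 percent. The second helper shows why a larger per-rank element limit caps the usable GPU count earlier than the CPU rank count for the same mesh.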

Original language	English
Title of host publication	Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022
Publisher	Association for Computing Machinery
Pages	94-102
Number of pages	9
ISBN (Electronic)	9781450384988
DOI	https://doi.org/10.1145/3492805.3492818
Publication status	Published - 7 Jan 2022
Externally published	Yes
Event	5th International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022 - Virtual, Online, Japan. Duration: 12 Jan 2022 → 14 Jan 2022

Publication series

Name	ACM International Conference Proceeding Series

Conference

Conference	5th International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022
Country/Territory	Japan
City	Virtual, Online
Period	12/01/22 → 14/01/22

Cite this

Vincent, J., Gong, J., Karp, M., Peplinski, A., Jansson, N., Podobas, A., Jocksch, A., Yao, J., Hussain, F., Markidis, S., Karlsson, M., Pleiter, D., Laure, E., & Schlatter, P. (2022). Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems. In Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022 (pp. 94-102). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3492805.3492818

@inproceedings{0a2624c0a0c6471ebb34199dbcb76b2f,

title = "Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems",

abstract = "We present new results on the strong parallel scaling of the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case is a direct numerical simulation of fully developed turbulent flow in a straight pipe at two Reynolds numbers, Reτ = 360 and Reτ = 550, based on friction velocity and pipe radius. Strong scaling is tested on several GPU-enabled HPC systems, including the Swiss Piz Daint system, TACC's Longhorn, J{\"u}lich's JUWELS Booster, and Berzelius in Sweden. The performance results show that a speed-up of between 3 and 5 can be achieved with the GPU-accelerated version compared with the CPU version on these systems. The run time for 20 timesteps falls from 43.5 to 13.2 seconds when the number of GPUs is increased from 64 to 512 for the Reτ = 550 case on the JUWELS Booster system. This illustrates the potential of the GPU-accelerated version for high throughput. At the same time, the strong scaling limit is significantly larger for GPUs, at about 2000-5000 elements per rank, compared with about 50-100 for a CPU rank.",

keywords = "Benchmarking, Computational Fluid Dynamics, Nek5000, OpenACC, Scaling",

author = "Jonathan Vincent and Jing Gong and Martin Karp and Adam Peplinski and Niclas Jansson and Artur Podobas and Andreas Jocksch and Jie Yao and Fazle Hussain and Stefano Markidis and Matts Karlsson and Dirk Pleiter and Erwin Laure and Philipp Schlatter",

note = "Publisher Copyright: {\textcopyright} 2022 ACM.; 5th International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022 ; Conference date: 12-01-2022 Through 14-01-2022",

year = "2022",

month = jan,

day = "7",

doi = "10.1145/3492805.3492818",

language = "English",

series = "ACM International Conference Proceeding Series",

publisher = "Association for Computing Machinery",

pages = "94--102",

booktitle = "Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022",

}

Vincent, J, Gong, J, Karp, M, Peplinski, A, Jansson, N, Podobas, A, Jocksch, A, Yao, J, Hussain, F, Markidis, S, Karlsson, M, Pleiter, D, Laure, E & Schlatter, P 2022, Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems. in Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022. ACM International Conference Proceeding Series, Association for Computing Machinery, pp. 94-102, 5th International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022, Virtual, Online, Japan, 12/01/22. https://doi.org/10.1145/3492805.3492818

Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems. / Vincent, Jonathan; Gong, Jing; Karp, Martin et al.
Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022. Association for Computing Machinery, 2022. pp. 94-102 (ACM International Conference Proceeding Series).

TY - GEN

T1 - Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems

AU - Vincent, Jonathan

AU - Gong, Jing

AU - Karp, Martin

AU - Peplinski, Adam

AU - Jansson, Niclas

AU - Podobas, Artur

AU - Jocksch, Andreas

AU - Yao, Jie

AU - Hussain, Fazle

AU - Markidis, Stefano

AU - Karlsson, Matts

AU - Pleiter, Dirk

AU - Laure, Erwin

AU - Schlatter, Philipp

PY - 2022/1/7

Y1 - 2022/1/7

AB - We present new results on the strong parallel scaling of the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case is a direct numerical simulation of fully developed turbulent flow in a straight pipe at two Reynolds numbers, Reτ = 360 and Reτ = 550, based on friction velocity and pipe radius. Strong scaling is tested on several GPU-enabled HPC systems, including the Swiss Piz Daint system, TACC's Longhorn, Jülich's JUWELS Booster, and Berzelius in Sweden. The performance results show that a speed-up of between 3 and 5 can be achieved with the GPU-accelerated version compared with the CPU version on these systems. The run time for 20 timesteps falls from 43.5 to 13.2 seconds when the number of GPUs is increased from 64 to 512 for the Reτ = 550 case on the JUWELS Booster system. This illustrates the potential of the GPU-accelerated version for high throughput. At the same time, the strong scaling limit is significantly larger for GPUs, at about 2000-5000 elements per rank, compared with about 50-100 for a CPU rank.

KW - Benchmarking

KW - Computational Fluid Dynamics

KW - Nek5000

KW - OpenACC

KW - Scaling

UR - http://www.scopus.com/inward/record.url?scp=85122621284&partnerID=8YFLogxK

U2 - 10.1145/3492805.3492818

DO - 10.1145/3492805.3492818

M3 - Conference contribution

AN - SCOPUS:85122621284

T3 - ACM International Conference Proceeding Series

SP - 94

EP - 102

BT - Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022

PB - Association for Computing Machinery

T2 - 5th International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022

Y2 - 12 January 2022 through 14 January 2022

ER -

Vincent J, Gong J, Karp M, Peplinski A, Jansson N, Podobas A et al. Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems. In Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2022. Association for Computing Machinery. 2022. p. 94-102. (ACM International Conference Proceeding Series). doi: 10.1145/3492805.3492818
