Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core

Luming Wang; Defeng Chen; Dongliang Wang; Chao Wang

doi:10.1109/IMCEC55388.2022.10019890

School of Information and Electronics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The processing speed of a radar echo coherent accumulation system is an important factor in the real-time performance of space target detection. In this paper, we design a radar echo coherent accumulation system on a V100 GPU that uses the half-precision format and tensor cores to accelerate processing. The design comprises optimizing the coherent accumulation process, designing the scaling coefficient, and using the tcFFT library to implement the FFT with WMMA. We compare the speed of the coherent accumulation system under FP32, FP16, and FP16 with tensor cores. For FP32 and FP16, the FFT is performed with the cuFFT library; for FP16 with tensor cores, it is performed with the tcFFT library. Speed is measured with Nsight Compute. The test results show that: (a) creating an FFT plan takes less time in tcFFT than in cuFFT; (b) for a single batch, FP16 achieves a 1.18x to 1.39x speedup over FP32 across the whole coherent accumulation process, and for multiple batches, for which a parallel batch processing method is proposed, FP16 with tensor cores achieves a 2.23x to 3.17x speedup over FP16 in the two-dimensional FFT and a 1.54x to 1.77x speedup across the whole coherent accumulation process.
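The abstract mentions a scaling coefficient but does not say how it is chosen. As a rough illustration of why half precision needs one, the sketch below rounds the inputs and outputs of a naive DFT to FP16 and shows a coherently accumulated peak overflowing unless the input is scaled first. The rule in `scaling_coefficient`, the `headroom` parameter, and all function names are assumptions made for this example, not the authors' design; the example also models FP16 storage only, not tensor-core arithmetic or the tcFFT and cuFFT libraries.

```python
import cmath
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 binary16 value


def to_fp16(x: float) -> float:
    """Round x to the nearest FP16 value; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def scaling_coefficient(n: int, peak: float, headroom: float = 0.5) -> float:
    """Hypothetical rule (an assumption, not taken from the paper).

    An n-point DFT satisfies |X[k]| <= n * max|x|, so scaling the input by
    headroom * FP16_MAX / (n * peak) keeps every output finite in FP16.
    """
    return headroom * FP16_MAX / (n * peak)


def dft_fp16_storage(samples, scale=1.0):
    """Naive DFT whose inputs and outputs are rounded to FP16.

    Only storage is modelled: the sum itself runs in double precision,
    unlike a real FP16 FFT, which can also overflow in intermediate stages.
    """
    n = len(samples)
    x = [complex(to_fp16(s.real * scale), to_fp16(s.imag * scale)) for s in samples]
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        out.append(complex(to_fp16(acc.real), to_fp16(acc.imag)))
    return out


def make_tone(n: int, amplitude: float, doppler_bin: int):
    """n slow-time samples of one range cell: a single complex Doppler tone."""
    return [amplitude * cmath.exp(2j * cmath.pi * doppler_bin * t / n) for t in range(n)]


if __name__ == "__main__":
    n, amplitude, doppler_bin = 256, 500.0, 5
    echo = make_tone(n, amplitude, doppler_bin)
    unscaled = dft_fp16_storage(echo)
    scaled = dft_fp16_storage(echo, scaling_coefficient(n, amplitude))
    print("unscaled peak:", abs(unscaled[doppler_bin]))
    print("scaled peak:  ", abs(scaled[doppler_bin]))
```

Coherent accumulation over n pulses grows the target peak by a factor of up to n, so an amplitude that fits comfortably in FP16 at the input can exceed 65504 at the output. Scaling before the transform trades some precision on weak returns for a finite peak, which is presumably why the coefficient has to be designed rather than set arbitrarily.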

Original language	English
Title of host publication	IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference
Editors	Bing Xu, Bing Xu
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	990-995
Number of pages	6
ISBN (Electronic)	9781665479677
DOI	https://doi.org/10.1109/IMCEC55388.2022.10019890
Publication status	Published - 2022
Event	5th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2022 - Chongqing, China. Duration: 16 Dec 2022 → 18 Dec 2022

Publication series

Name	IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference

Conference

Conference	5th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2022
Country/Territory	China
City	Chongqing
Period	16/12/22 → 18/12/22

Cite this

Wang, L., Chen, D., Wang, D., & Wang, C. (2022). Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core. In B. Xu, & B. Xu (Eds.), IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference (pp. 990-995). (IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IMCEC55388.2022.10019890

Wang, Luming ; Chen, Defeng ; Wang, Dongliang et al. / Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core. IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference. Editor / Bing Xu ; Bing Xu. Institute of Electrical and Electronics Engineers Inc., 2022. pp. 990-995 (IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference).

@inproceedings{3bb6e9c5eba1448da007c8e88c7a5173,

title = "Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core",

abstract = "The processing speed of radar echo coherent accumulation system is an important factor affecting the real-time performance of space target detection. In this paper, based on GPU V100, adopting the concept of half-precision and tensor core, we design the radar echo coherent accumulation system and achieve the acceleration effect. The design of the system includes optimizing the process of coherent accumulation system, designing the scaling coefficient and using tcFFT library to realize FFT with the method of WMMA. We use FP32, FP16 and FP16 tensor core to compare the speed of coherent accumulation system. In FP32 and FP16, we use CUFFT library to realize FFT operation, and in FP16 tensor core, we call tcFFT library to realize FFT operation. Nsight Compute is used to test the speed. The test results show that: (a) The time of creating FFT plan in tcFFT is less than CUFFT. (b) In the case of single batch, FP16 achieves 1.18X-1.39X acceleration effect compared with FP32 in the whole coherent accumulation process; In the case of multiple batches, the parallel batch processing method is proposed, and in two-dimensional FFT, compared with FP16, FP16 tensor core can achieve 2.23X-3.17X acceleration effect, in the whole phase-coherent accumulation process, it can achieve 1.54X-1.77X acceleration effect.",

keywords = "GPU, coherent accumulation system, half precision, tensor core",

author = "Luming Wang and Defeng Chen and Dongliang Wang and Chao Wang",

note = "Publisher Copyright: {\textcopyright} 2022 IEEE.; 5th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2022 ; Conference date: 16-12-2022 Through 18-12-2022",

year = "2022",

doi = "10.1109/IMCEC55388.2022.10019890",

language = "English",

series = "IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "990--995",

editor = "Bing Xu and Bing Xu",

booktitle = "IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference",

address = "United States",

}

Wang, L, Chen, D, Wang, D & Wang, C 2022, Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core. in B Xu & B Xu (eds), IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference. IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference, Institute of Electrical and Electronics Engineers Inc., pp. 990-995, 5th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2022, Chongqing, China, 16/12/22. https://doi.org/10.1109/IMCEC55388.2022.10019890

Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core. / Wang, Luming; Chen, Defeng; Wang, Dongliang et al.
IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference. Editor / Bing Xu; Bing Xu. Institute of Electrical and Electronics Engineers Inc., 2022. pp. 990-995 (IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference).

TY - GEN

T1 - Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core

AU - Wang, Luming

AU - Chen, Defeng

AU - Wang, Dongliang

AU - Wang, Chao

PY - 2022

Y1 - 2022

AB - The processing speed of radar echo coherent accumulation system is an important factor affecting the real-time performance of space target detection. In this paper, based on GPU V100, adopting the concept of half-precision and tensor core, we design the radar echo coherent accumulation system and achieve the acceleration effect. The design of the system includes optimizing the process of coherent accumulation system, designing the scaling coefficient and using tcFFT library to realize FFT with the method of WMMA. We use FP32, FP16 and FP16 tensor core to compare the speed of coherent accumulation system. In FP32 and FP16, we use CUFFT library to realize FFT operation, and in FP16 tensor core, we call tcFFT library to realize FFT operation. Nsight Compute is used to test the speed. The test results show that: (a) The time of creating FFT plan in tcFFT is less than CUFFT. (b) In the case of single batch, FP16 achieves 1.18X-1.39X acceleration effect compared with FP32 in the whole coherent accumulation process; In the case of multiple batches, the parallel batch processing method is proposed, and in two-dimensional FFT, compared with FP16, FP16 tensor core can achieve 2.23X-3.17X acceleration effect, in the whole phase-coherent accumulation process, it can achieve 1.54X-1.77X acceleration effect.

KW - GPU

KW - coherent accumulation system

KW - half precision

KW - tensor core

UR - http://www.scopus.com/inward/record.url?scp=85147690214&partnerID=8YFLogxK

U2 - 10.1109/IMCEC55388.2022.10019890

DO - 10.1109/IMCEC55388.2022.10019890

M3 - Conference contribution

AN - SCOPUS:85147690214

T3 - IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference

SP - 990

EP - 995

BT - IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference

A2 - Xu, Bing

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 5th IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference, IMCEC 2022

Y2 - 16 December 2022 through 18 December 2022

ER -

Wang L, Chen D, Wang D, Wang C. Acceleration of radar echo coherent accumulation system based on half-precision format and tensor core. In Xu B, Xu B, editors, IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference. Institute of Electrical and Electronics Engineers Inc. 2022. p. 990-995. (IMCEC 2022 - IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference). doi: 10.1109/IMCEC55388.2022.10019890
