TY - JOUR
T1 - High-Throughput Energy-Efficient Accelerator With Collaborative-Trainable Sparse-Quantization Method for On-Board Remote Sensing Processing
AU - Wang, Tong
AU - Chen, He
AU - Zhang, Ning
AU - Ni, Shuo
AU - Zhang, Xi
AU - Chen, Liang
AU - Li, Wei
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Convolutional neural networks (CNNs) have achieved remarkable breakthroughs on remote sensing tasks in recent years. However, deploying CNNs for real-time on-board remote sensing processing remains a challenge due to power consumption, real-time requirements, and other constraints. Therefore, this article proposes a satellite-based real-time remote sensing accelerator in which algorithm and hardware approaches jointly optimize CNNs’ deployment on edge-side aerospace devices. First, a collaborative-trainable sparse-quantization (CTSQ) method is proposed to reduce the model’s storage overhead. In the CTSQ method, the errors introduced by composing sparsity and quantization are analyzed. In addition, the interchannel correlations among parameters are leveraged, and structured sparsity and quantization are performed with fine-grained units. Second, a modular-system co-optimized (MoSyC) architecture is proposed. A hardware-mapped sparse access (HMSA) strategy is proposed to effectively filter out zero elements in sparse parameters. Moreover, a high-throughput architecture is designed for parallel and pipelined data flow control. Finally, extensive experiments are conducted on both scene classification and object detection tasks with ResNet and YOLOv5 models. The results show that the proposed CTSQ method achieves a compression ratio of more than 13.81×, and the proposed MoSyC architecture achieves a throughput of more than 1815 giga operations per second (GOPS), demonstrating the effectiveness of the proposed accelerator.
KW - Convolutional neural network (CNN)
KW - field programmable gate array (FPGA)
KW - quantization
KW - real-time
KW - remote sensing
KW - sparsity
UR - https://www.scopus.com/pages/publications/105018041085
U2 - 10.1109/TGRS.2025.3616011
DO - 10.1109/TGRS.2025.3616011
M3 - Article
AN - SCOPUS:105018041085
SN - 0196-2892
VL - 63
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5646518
ER -