TY - JOUR
T1 - High-Performance ECC Scalar Multiplication Architecture Based on Comb Method and Low-Latency Window Recoding Algorithm
AU - Zhang, Jingqi
AU - Chen, Zhiming
AU - Ma, Mingzhi
AU - Jiang, Rongkun
AU - Li, Hongshuo
AU - Wang, Weijiang
N1 - Publisher Copyright: © 1993-2012 IEEE.
PY - 2024/2/1
Y1 - 2024/2/1
N2 - Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for performance and security. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance the resistance toward side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. The theoretical analysis demonstrates that the enhanced comb method with w = 4 strikes the balance of computation burden for both extreme cases. To achieve short clock cycle latency and high frequency, the data dependency of ECSM is thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is well-designed, ensuring that the critical path (CP) only contains minimal logic primitives apart from the MAC. Besides, the ideal placement of pipeline stages for MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and 10.85 μs with 3422, 7983, and 20158 slices over GF(2^163), GF(2^283), and GF(2^571), respectively. Implementation results reveal that our design shows 53.60%, 39.36%, and 32.64% performance improvement over the existing state-of-the-art works, respectively.
AB - Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for performance and security. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance the resistance toward side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. The theoretical analysis demonstrates that the enhanced comb method with w = 4 strikes the balance of computation burden for both extreme cases. To achieve short clock cycle latency and high frequency, the data dependency of ECSM is thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is well-designed, ensuring that the critical path (CP) only contains minimal logic primitives apart from the MAC. Besides, the ideal placement of pipeline stages for MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and 10.85 μs with 3422, 7983, and 20158 slices over GF(2^163), GF(2^283), and GF(2^571), respectively. Implementation results reveal that our design shows 53.60%, 39.36%, and 32.64% performance improvement over the existing state-of-the-art works, respectively.
KW - Elliptic curve cryptography (ECC)
KW - elliptic curve scalar multiplication (ECSM)
KW - field-programmable gate array (FPGA)
UR - http://www.scopus.com/inward/record.url?scp=85179113426&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2023.3321772
DO - 10.1109/TVLSI.2023.3321772
M3 - Article
AN - SCOPUS:85179113426
SN - 1063-8210
VL - 32
SP - 382
EP - 395
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 2
ER -