High-Performance ECC Scalar Multiplication Architecture Based on Comb Method and Low-Latency Window Recoding Algorithm

Jingqi Zhang; Zhiming Chen; Mingzhi Ma; Rongkun Jiang; Hongshuo Li; Weijiang Wang

doi:10.1109/TVLSI.2023.3321772

Jingqi Zhang, Zhiming Chen, Mingzhi Ma, Rongkun Jiang, Hongshuo Li, Weijiang Wang*

*Corresponding author for this work

School of Integrated Circuits and Electronics

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for performance and security. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance the resistance toward side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. The theoretical analysis demonstrates that the enhanced comb method with w = 4 strikes the balance of computation burden for both extreme cases. To achieve short clock cycle latency and high frequency, the data dependency of ECSM is thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is well-designed, ensuring that the critical path (CP) only contains minimal logic primitives apart from the MAC. Besides, the ideal placement of pipeline stages for MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and 10.85 μs with 3422, 7983, and 20158 slices over GF(2^163), GF(2^283), and GF(2^571), respectively. Implementation results reveal that our design shows 53.60%, 39.36%, and 32.64% performance improvement over the existing state-of-the-art works, respectively.
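
The abstract names a Karatsuba multiplier over binary fields but gives no implementation detail. As a rough software illustration of that general technique only, the sketch below multiplies two binary polynomials with the Karatsuba split and reduces the product modulo a field polynomial. It is not the authors' hardware datapath, pipeline schedule, or LLW recoding. The function names, the recursion threshold, and the choice of the NIST reduction polynomial x^163 + x^7 + x^6 + x^3 + 1 for the 163-bit field are assumptions made for this example; the abstract does not state which polynomial the design uses.

```python
# Binary polynomials are stored as Python ints: bit i is the coefficient of x^i.

# Assumed reduction polynomial (NIST 163-bit binary field); not taken from the abstract.
F163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def clmul_schoolbook(a: int, b: int) -> int:
    """Carry-less product in GF(2)[x] by shift-and-XOR (reference method)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a: int, b: int, threshold: int = 32) -> int:
    """Carry-less product in GF(2)[x] using the Karatsuba split.

    Addition and subtraction are both XOR over GF(2), so with
    a = a_hi*x^h + a_lo and b = b_hi*x^h + b_lo:
        a*b = hi*x^(2h) + (mid + hi + lo)*x^h + lo
    where mid = (a_hi + a_lo) * (b_hi + b_lo).
    """
    n = max(a.bit_length(), b.bit_length())
    if n <= max(threshold, 1):
        return clmul_schoolbook(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    lo = clmul_karatsuba(a_lo, b_lo, threshold)
    hi = clmul_karatsuba(a_hi, b_hi, threshold)
    mid = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, threshold)
    return (hi << (2 * h)) ^ ((mid ^ hi ^ lo) << h) ^ lo


def reduce_mod(p: int, modulus: int) -> int:
    """Reduce polynomial p modulo the given polynomial by repeated XOR."""
    m = modulus.bit_length() - 1  # degree of the modulus
    while p.bit_length() > m:
        p ^= modulus << (p.bit_length() - 1 - m)
    return p


def gf2m_mul(a: int, b: int, modulus: int = F163) -> int:
    """Field multiplication: Karatsuba product followed by reduction."""
    return reduce_mod(clmul_karatsuba(a, b), modulus)
```

Calling `gf2m_mul(a, b)` with two integers below 2^163 returns their product in the assumed field. The three recursive half-size products replace the four that a direct split would need, which is the saving a Karatsuba multiplier exploits; how the paper maps this onto a two-stage pipelined MAC is not described in the abstract and is not modeled here.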

Original language	English
Pages (from-to)	382-395
Number of pages	14
Journal	IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume	32
Issue number	2
DOIs	https://doi.org/10.1109/TVLSI.2023.3321772
Publication status	Published - 1 Feb 2024

Keywords

Elliptic curve cryptography (ECC)
elliptic curve scalar multiplication (ECSM)
field-programmable gate array (FPGA)

Cite this

@article{494e323f5e5c4d009cedf8dfe8b98c49,
  title = "High-Performance ECC Scalar Multiplication Architecture Based on Comb Method and Low-Latency Window Recoding Algorithm",
  abstract = "Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for performance and security. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance the resistance toward side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. The theoretical analysis demonstrates that the enhanced comb method with w = 4 strikes the balance of computation burden for both extreme cases. To achieve short clock cycle latency and high frequency, the data dependency of ECSM is thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is well-designed, ensuring that the critical path (CP) only contains minimal logic primitives apart from the MAC. Besides, the ideal placement of pipeline stages for MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and 10.85 $\mu$s with 3422, 7983, and 20158 slices over GF($2^{163}$), GF($2^{283}$), and GF($2^{571}$), respectively. Implementation results reveal that our design shows 53.60\%, 39.36\%, and 32.64\% performance improvement over the existing state-of-the-art works, respectively.",
  keywords = "Elliptic curve cryptography (ECC), elliptic curve scalar multiplication (ECSM), field-programmable gate array (FPGA)",
  author = "Jingqi Zhang and Zhiming Chen and Mingzhi Ma and Rongkun Jiang and Hongshuo Li and Weijiang Wang",
  note = "Publisher Copyright: {\textcopyright} 1993-2012 IEEE.",
  year = "2024",
  month = feb,
  day = "1",
  doi = "10.1109/TVLSI.2023.3321772",
  language = "English",
  volume = "32",
  pages = "382--395",
  journal = "IEEE Transactions on Very Large Scale Integration (VLSI) Systems",
  issn = "1063-8210",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  number = "2",
}

TY  - JOUR
T1  - High-Performance ECC Scalar Multiplication Architecture Based on Comb Method and Low-Latency Window Recoding Algorithm
AU  - Zhang, Jingqi
AU  - Chen, Zhiming
AU  - Ma, Mingzhi
AU  - Jiang, Rongkun
AU  - Li, Hongshuo
AU  - Wang, Weijiang
PY  - 2024/2/1
Y1  - 2024/2/1
N2  - Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for performance and security. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance the resistance toward side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. The theoretical analysis demonstrates that the enhanced comb method with w = 4 strikes the balance of computation burden for both extreme cases. To achieve short clock cycle latency and high frequency, the data dependency of ECSM is thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is well-designed, ensuring that the critical path (CP) only contains minimal logic primitives apart from the MAC. Besides, the ideal placement of pipeline stages for MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and 10.85 μs with 3422, 7983, and 20158 slices over GF(2^163), GF(2^283), and GF(2^571), respectively. Implementation results reveal that our design shows 53.60%, 39.36%, and 32.64% performance improvement over the existing state-of-the-art works, respectively.
AB  - Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for performance and security. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance the resistance toward side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. The theoretical analysis demonstrates that the enhanced comb method with w = 4 strikes the balance of computation burden for both extreme cases. To achieve short clock cycle latency and high frequency, the data dependency of ECSM is thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is well-designed, ensuring that the critical path (CP) only contains minimal logic primitives apart from the MAC. Besides, the ideal placement of pipeline stages for MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and 10.85 μs with 3422, 7983, and 20158 slices over GF(2^163), GF(2^283), and GF(2^571), respectively. Implementation results reveal that our design shows 53.60%, 39.36%, and 32.64% performance improvement over the existing state-of-the-art works, respectively.
KW  - Elliptic curve cryptography (ECC)
KW  - elliptic curve scalar multiplication (ECSM)
KW  - field-programmable gate array (FPGA)
UR  - http://www.scopus.com/inward/record.url?scp=85179113426&partnerID=8YFLogxK
U2  - 10.1109/TVLSI.2023.3321772
DO  - 10.1109/TVLSI.2023.3321772
M3  - Article
AN  - SCOPUS:85179113426
SN  - 1063-8210
VL  - 32
SP  - 382
EP  - 395
JO  - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF  - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS  - 2
ER  -
