TY - JOUR
T1 - Hyper-Parallel Superscalar Asynchronous RISC-V Processor Based on Event-Driven Logic
AU - Zhao, Kangli
AU - He, Anping
AU - Ma, Jun
AU - Zhang, Lixin
AU - Zhu, Lixian
AU - Dong, Qunxi
AU - Tian, Fuze
AU - Zhou, Qingguo
AU - Zhao, Qinglin
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2025
Y1 - 2025
AB - Event-driven neuromorphic computing involves sparse and asynchronous signal activity, which leads to irregular computation patterns and fine-grained concurrency. As a result, processing architectures need to support both high parallelism and energy efficiency. Among existing architectural solutions, superscalar designs exhibit significant potential for addressing high parallelism demands. However, conventional superscalar processors, which rely on synchronous circuits, maintain high-frequency clocking at all times, leading to substantial power inefficiency in sparse computation scenarios. To address this issue, we propose an asynchronous superscalar architecture that replaces global clocking with fully local handshake-based control, implemented using a bundled-data asynchronous protocol. The design supports decoding of up to 64 scalar instructions per cycle and implements the RISC-V RV32IMC instruction set. A prototype was fabricated using a 110 nm complementary metal oxide semiconductor (CMOS) process and was evaluated through post-layout simulation. Operating at 1.2 V, the processor delivers a peak INT8 throughput of 669.4 GOPS, with a static power consumption of 421 mW.
KW - Asynchronous superscalar architecture
KW - RISC-V
KW - instruction-level parallelism
KW - neuromorphic computing
KW - parallel computing
UR - https://www.scopus.com/pages/publications/105026114445
U2 - 10.1109/TCSS.2025.3632912
DO - 10.1109/TCSS.2025.3632912
M3 - Article
AN - SCOPUS:105026114445
SN - 2329-924X
JO - IEEE Transactions on Computational Social Systems
JF - IEEE Transactions on Computational Social Systems
ER -