A 28-nm 19.9-to-258.5-TOPS/W 8b Digital Computing-in-Memory Processor With Two-Cycle Macro Featuring Winograd-Domain Convolution and Macro-Level Parallel Dual-Side Sparsity

Hao Wu, Yong Chen, Yiyang Yuan, Jinshan Yue, Xinghua Wang, Xiaoran Li, Feng Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, computing in memory (CiM) has been proven to be an energy-efficient and promising architecture for artificial intelligence (AI) algorithms. However, current CiM schemes generally suffer from limited throughput compared to their digital counterparts, mainly because the CiM macro calculation must iterate through multiple cycles. Thus, the need to reduce the calculation cycles of the macro while keeping high energy efficiency, and the need for acceleration methods for a universal CiM-based processor, have become major issues for current CiM architectures. To surmount these critical problems, we propose a processor based on a two-cycle CiM macro. Our work makes three main contributions: 1) we present a Radix16-based digital-CiM macro with look-up table (LUT) optimization to reduce dynamic power consumption; 2) we devise a hybrid Winograd microarchitecture and dataflow that supports (2, 3) and (4, 3) Winograd convolution, so that a good compromise can be reached between the accuracy of the algorithm and the reduction in workload; and 3) we propose a macro-level parallel dual-side sparse CiM core that uses a horizontal direction compression method to reduce the input cycles of activation data and improve the mapping efficiency of the weight data in the macros. A prototype of the processor is fabricated in 28-nm CMOS and achieves a peak system energy efficiency of 19.9–258.5 TOPS/W for a supply voltage of 0.6–1.1 V and an operating frequency of 78–287 MHz, which is 2.55–7.08× higher than other state-of-the-art CiM processors.
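
The workload reduction mentioned for Winograd convolution can be illustrated with a minimal sketch, assuming that "(2, 3)" refers to the standard one-dimensional F(2, 3) minimal filtering algorithm (two outputs of a 3-tap filter). This is the textbook arithmetic only, not the processor's microarchitecture or dataflow; the function names `direct_conv_1d` and `winograd_f23` are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def direct_conv_1d(d, g):
    """Reference: two outputs of a 3-tap filter over 4 inputs (6 multiplies)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Textbook Winograd F(2, 3): the same two outputs with 4 multiplies.

    The filter-side terms depend only on g, so they can be computed once
    per filter and reused across all input tiles.
    """
    half = Fraction(1, 2)  # exact arithmetic, so results match bit for bit
    # Filter transform (precomputable per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) * half
    u2 = (g[0] - g[1] + g[2]) * half
    u3 = g[2]
    # Input transform followed by the 4 element-wise multiplies.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    d = [1, 2, 3, 4]   # illustrative input tile
    g = [1, 0, -1]     # illustrative 3-tap filter
    print(direct_conv_1d(d, g), winograd_f23(d, g))
```

In this sketch the multiply count per tile drops from 6 to 4. Larger tiles such as (4, 3) reduce the multiply count further but use transform constants with a wider dynamic range, which is one common reason for the accuracy and workload trade-off the abstract refers to.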

Original language: English
Pages (from-to): 1-15
Number of pages: 15
Journal: IEEE Journal of Solid-State Circuits
Publication status: Accepted/In press - 2024

Keywords

  • Accuracy
  • Artificial intelligence
  • Artificial intelligence (AI)
  • Circuits
  • CMOS
  • computing-in-memory (CiM)
  • Energy efficiency
  • look-up table (LUT)
  • multiply-accumulation (MAC)
  • neural network (NN)
  • Power demand
  • Radix16
  • Table lookup
  • Throughput
  • unstructured sparsity
  • Winograd convolution
