TY - GEN
T1 - Eudoxus
T2 - 27th Annual IEEE International Symposium on High Performance Computer Architecture, HPCA 2021
AU - Gan, Yiming
AU - Bo, Yu
AU - Tian, Boyuan
AU - Xu, Leimeng
AU - Hu, Wei
AU - Liu, Shaoshan
AU - Liu, Qiang
AU - Zhang, Yanjun
AU - Tang, Jie
AU - Zhu, Yuhao
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/2
Y1 - 2021/2
N2 - We develop and commercialize autonomous machines, such as logistic robots and self-driving cars, around the globe. A critical challenge to our (and any) autonomous machine is accurate and efficient localization under resource constraints, which has recently fueled specialized localization accelerators. Prior acceleration efforts are point solutions in that they each specialize for a specific localization algorithm. In real-world commercial deployments, however, autonomous machines routinely operate under different environments and no single localization algorithm fits all the environments. Simply stacking together point solutions not only leads to cost and power budget overrun, but also results in an overly complicated software stack. This paper demonstrates our new software-hardware co-designed framework for autonomous machine localization, which adapts to different operating scenarios by fusing fundamental algorithmic primitives. Through characterizing the software framework, we identify ideal acceleration candidates that contribute significantly to the end-to-end latency and/or latency variation. We show how to co-design a hardware accelerator to systematically exploit the parallelisms, locality, and common building blocks inherent in the localization framework. We build, deploy, and evaluate an FPGA prototype on our next-generation self-driving cars. To demonstrate the flexibility of our framework, we also instantiate another FPGA prototype targeting drones, which represent mobile autonomous machines. We achieve about 2× speedup and 4× energy reduction compared to widely deployed, optimized implementations on general-purpose platforms.
AB - We develop and commercialize autonomous machines, such as logistic robots and self-driving cars, around the globe. A critical challenge to our (and any) autonomous machine is accurate and efficient localization under resource constraints, which has recently fueled specialized localization accelerators. Prior acceleration efforts are point solutions in that they each specialize for a specific localization algorithm. In real-world commercial deployments, however, autonomous machines routinely operate under different environments and no single localization algorithm fits all the environments. Simply stacking together point solutions not only leads to cost and power budget overrun, but also results in an overly complicated software stack. This paper demonstrates our new software-hardware co-designed framework for autonomous machine localization, which adapts to different operating scenarios by fusing fundamental algorithmic primitives. Through characterizing the software framework, we identify ideal acceleration candidates that contribute significantly to the end-to-end latency and/or latency variation. We show how to co-design a hardware accelerator to systematically exploit the parallelisms, locality, and common building blocks inherent in the localization framework. We build, deploy, and evaluate an FPGA prototype on our next-generation self-driving cars. To demonstrate the flexibility of our framework, we also instantiate another FPGA prototype targeting drones, which represent mobile autonomous machines. We achieve about 2× speedup and 4× energy reduction compared to widely deployed, optimized implementations on general-purpose platforms.
UR - http://www.scopus.com/inward/record.url?scp=85104964462&partnerID=8YFLogxK
U2 - 10.1109/HPCA51647.2021.00074
DO - 10.1109/HPCA51647.2021.00074
M3 - Conference contribution
AN - SCOPUS:85104964462
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
SP - 827
EP - 840
BT - Proceeding - 27th IEEE International Symposium on High Performance Computer Architecture, HPCA 2021
PB - IEEE Computer Society
Y2 - 27 February 2021 through 1 March 2021
ER -