When In-Network Computing Meets Distributed Machine Learning

Haowen Zhu, Wenchao Jiang, Qi Hong, Zehua Guo

Research output: Contribution to journal › Article › peer-review

Abstract

The emerging In-Network Computing (INC) technique provides a new opportunity to improve application performance by exploiting the network programmability, computational capability, and storage capacity of programmable switches. One typical application is Distributed Machine Learning (DML), which accelerates machine learning training by employing multiple workers to train a model in parallel. This paper introduces INC-based DML systems, analyzes the performance improvement gained from using INC, and surveys current studies of INC-based DML systems. We also propose potential research directions for applying INC to DML systems.

Original language: English
Pages (from-to): 1
Number of pages: 1
Journal: IEEE Network
DOIs
Publication status: Accepted/In press - 2024

Keywords

  • Computational modeling
  • Data models
  • Distributed Machine Learning
  • In-Network Computing
  • Machine Learning
  • Performance evaluation
  • Programmable Switch
  • Servers
  • Synchronization
  • Training

