Abstract
The emerging In-Network Computing (INC) technique provides a new opportunity to improve application performance by exploiting the network programmability, computational capability, and storage capacity of programmable switches. One typical application is Distributed Machine Learning (DML), which accelerates machine learning training by employing multiple workers to train a model in parallel. This paper introduces INC-based DML systems, analyzes the performance improvements that INC enables, and surveys current studies of INC-based DML systems. We also propose potential research directions for applying INC to DML systems.
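To make the DML setting concrete, a minimal sketch (not from the paper) of the gradient aggregation step that INC offloads: in data-parallel training, each worker sends its gradients each step, and an in-network aggregator on a programmable switch can sum them in flight instead of burdening a parameter server. All names here are illustrative.

```python
def aggregate_gradients(worker_grads):
    """Element-wise mean of per-worker gradients, mimicking what an
    in-network aggregator computes as packets traverse the switch."""
    n = len(worker_grads)
    dim = len(worker_grads[0])
    # The switch keeps a running sum per gradient slot (its register array).
    totals = [0.0] * dim
    for grad in worker_grads:
        for i, g in enumerate(grad):
            totals[i] += g
    # Workers receive the averaged gradient and apply the same update.
    return [t / n for t in totals]

# Three hypothetical workers produce gradients for a 4-parameter model.
grads = [
    [0.1, 0.2, 0.3, 0.4],
    [0.3, 0.2, 0.1, 0.0],
    [0.2, 0.2, 0.2, 0.2],
]
avg = aggregate_gradients(grads)
```

Offloading this reduction to the switch cuts the traffic reaching the aggregation endpoint from one gradient stream per worker to a single aggregated stream, which is the bandwidth saving INC-based DML systems exploit.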
| Original language | English |
|---|---|
| Pages (from-to) | 1 |
| Number of pages | 1 |
| Journal | IEEE Network |
| DOIs | https://doi.org/10.1109/MNET.2024.3368138 |
| Publication status | Accepted/In press - 2024 |
Keywords
- Computational modeling
- Data models
- Distributed Machine Learning
- In-Network Computing
- Machine Learning
- Performance evaluation
- Programmable Switch
- Servers
- Synchronization
- Training
Cite this
Zhu, H., Jiang, W., Hong, Q., & Guo, Z. (Accepted/In press). When In-Network Computing Meets Distributed Machine Learning. IEEE Network, 1. https://doi.org/10.1109/MNET.2024.3368138