Abstract
The emerging In-Network Computing (INC) technique provides a new opportunity to improve application performance by using the network programmability, computational capability, and storage capacity enabled by programmable switches. One typical application is Distributed Machine Learning (DML), which accelerates machine learning training by employing multiple workers to train a model in parallel. This paper introduces INC-based DML systems, analyzes the performance improvement gained from using INC, and surveys current studies of INC-based DML systems. We also propose potential research directions for applying INC to DML systems.
Original language | English |
---|---|
Pages (from-to) | 1 |
Number of pages | 1 |
Journal | IEEE Network |
DOIs | |
Publication status | Accepted/In press - 2024 |
Keywords
- Computational modeling
- Data models
- Distributed Machine Learning
- In-Network Computing
- Machine Learning
- Performance evaluation
- Programmable Switch
- Servers
- Synchronization
- Training