TY - GEN
T1 - P4Com
T2 - 10th International Conference on Computer and Communication Systems, ICCCS 2025
AU - Xiang, Boyuan
AU - Chi, Cheng
AU - Gao, Shuai
AU - Li, Haonan
AU - Hou, Xindi
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Geo-Distributed Machine Learning (GDML) aims to enable datacenters to collaborate in training large-scale models. These datacenters are located in different geographic regions. However, the limited Wide Area Network (WAN) bandwidth resources restrict the performance of GDML systems. Existing solutions compromise model accuracy, while in-network computing-based approaches suffer from issues such as complex processing logic, high system implementation overhead, and a high probability of gradient aggregation conflicts. Therefore, this paper proposes a lightweight in-network data reduction mechanism, P4Com, based on the P4 programmable data plane. First, we design the network-layer identifier to represent gradient aggregation tasks. Then, a register conflict avoidance mechanism is proposed to improve register utilization efficiency. Building upon this, we design a lightweight data plane using the protocol-independent P4 language to support line-rate in-network gradient aggregation. Finally, a prototype system is built for validation, and experimental results show that P4Com significantly reduces the runtime latency of GDML systems.
AB - Geo-Distributed Machine Learning (GDML) aims to enable datacenters to collaborate in training large-scale models. These datacenters are located in different geographic regions. However, the limited Wide Area Network (WAN) bandwidth resources restrict the performance of GDML systems. Existing solutions compromise model accuracy, while in-network computing-based approaches suffer from issues such as complex processing logic, high system implementation overhead, and a high probability of gradient aggregation conflicts. Therefore, this paper proposes a lightweight in-network data reduction mechanism, P4Com, based on the P4 programmable data plane. First, we design the network-layer identifier to represent gradient aggregation tasks. Then, a register conflict avoidance mechanism is proposed to improve register utilization efficiency. Building upon this, we design a lightweight data plane using the protocol-independent P4 language to support line-rate in-network gradient aggregation. Finally, a prototype system is built for validation, and experimental results show that P4Com significantly reduces the runtime latency of GDML systems.
KW - GDML
KW - In-network computing
KW - P4
KW - Programmable data plane
UR - https://www.scopus.com/pages/publications/105011975409
U2 - 10.1109/ICCCS65393.2025.11069944
DO - 10.1109/ICCCS65393.2025.11069944
M3 - Conference contribution
AN - SCOPUS:105011975409
T3 - 10th International Conference on Computer and Communication Systems, ICCCS 2025
SP - 8
EP - 13
BT - 10th International Conference on Computer and Communication Systems, ICCCS 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 April 2025 through 21 April 2025
ER -