An ML-Accelerated Framework for Large-Scale Constrained Traffic Engineering

Cheng Gu, Xin Song, Ben Hok Ng, Qiao Xiang, Zehua Guo*, Geng Li*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Traffic engineering (TE) mechanisms are crucial for achieving optimal network performance over wide-area networks spanning geographically distributed datacenters. Existing work formulates TE as combinatorial optimization problems, which can take hours to solve on modern wide-area network topologies with thousands of nodes. To improve the performance of TE mechanisms, we introduce DeepTE, a new TE framework based on machine learning (ML) that is designed for scalability and performance: it completes the computation within milliseconds on networks with thousands of nodes and generates near-optimal TE policies while guaranteeing that all constraints are satisfied. DeepTE also uses a distributed ML model architecture that can be scaled horizontally across multiple GPUs for even better performance. Using real-world traffic matrices, our extensive performance evaluations of DeepTE on various network topologies and TE problems show that DeepTE produces policies within 5% of the optimal results while offering up to 100x performance improvements over state-of-the-art traffic engineering mechanisms.

Original language: English
Title of host publication: Proceedings - 2024 IEEE 44th International Conference on Distributed Computing Systems, ICDCS 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 47-58
Number of pages: 12
ISBN (Electronic): 9798350386059
Publication status: Published - 2024
Event: 44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024 - Jersey City, United States
Duration: 23 Jul 2024 to 26 Jul 2024

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
ISSN (Print): 1063-6927
ISSN (Electronic): 2575-8411

Conference

Conference: 44th IEEE International Conference on Distributed Computing Systems, ICDCS 2024
Country/Territory: United States
City: Jersey City
Period: 23/07/24 to 26/07/24

Keywords

  • Machine Learning
  • Traffic Engineering
  • Wide-area Network
