Asynchronous cooperative multi-agent deep reinforcement learning for joint RMSA and spectrum defragmentation in optical fiber communication networks

  • Xiao Zhang
  • Qinghua Tian*
  • Xiangjun Xin
  • Yiqun Pan
  • Haipeng Yao
  • Fu Wang
  • Ze Dong
  • Xiaolong Pan
  • Sitong Zhou
  • Feng Tian
  • Ran Gao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The rapid escalation of Internet traffic and increasingly heterogeneous service demands impose stringent requirements on optical networks for dynamic resource scheduling, efficient spectrum utilization, and automated operation. Elastic optical networks (EONs) are regarded as a promising solution, yet their performance remains constrained by two critical challenges: routing, modulation format, and spectrum allocation (RMSA) and spectrum defragmentation (SD). Existing approaches predominantly focus on optimizing one of these tasks, which may lead to limited adaptability and suboptimal network efficiency. To address this gap, we propose an asynchronous cooperative multi-agent deep reinforcement learning framework, termed MADRL-JRASD, for the joint optimization of RMSA and proactive SD. The framework incorporates an RMSA agent with dynamic allocation capability and an SD agent with autonomous decision-making ability, coordinated through an asynchronous architecture that enables adaptive responses to environmental changes. Invalid-action masking and carefully designed reward functions are further integrated to enhance training stability and convergence. Comprehensive evaluations over three representative topologies demonstrate that MADRL-JRASD reduces the blocking probability by up to 81% compared with RMSA heuristics without SD and achieves an 85% reduction in overhead relative to heuristic algorithms combining RMSA and SD that attain similar blocking performance. Moreover, the sensitivity analysis shows that the SD agent improves spectrum utilization and that multi-agent cooperation enhances global decision coordination, while action masking and reward design jointly strengthen the convergence and efficiency of MADRL-JRASD.
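The abstract cites invalid-action masking as one of the mechanisms that stabilizes training. The paper's own implementation is not reproduced here; the following is a minimal, generic sketch of how such a mask is commonly applied to policy logits before sampling (all names and the toy spectrum-slot scenario are illustrative, not taken from the paper):

```python
import numpy as np

def masked_policy(logits: np.ndarray, valid_mask: np.ndarray) -> np.ndarray:
    """Return action probabilities with invalid actions masked out.

    logits:     raw policy-network outputs, shape (n_actions,)
    valid_mask: boolean array, True where an action is feasible
                (e.g. a candidate spectrum block that is actually free)
    """
    # Assign -inf to infeasible actions so softmax drives their probability to 0.
    masked = np.where(valid_mask, logits, -np.inf)
    # Numerically stable softmax over the remaining feasible actions.
    masked = masked - masked.max()
    exp = np.exp(masked)
    return exp / exp.sum()

# Toy example: 4 candidate spectrum slots; slots 1 and 3 are already occupied.
logits = np.array([1.0, 3.0, 0.5, 2.0])
mask = np.array([True, False, True, False])
probs = masked_policy(logits, mask)
```

Masking at the logit level (rather than penalizing invalid choices through the reward) keeps the agent from ever sampling actions that would violate spectrum-continuity or contiguity constraints, which is consistent with the convergence benefit the abstract attributes to the technique.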

Original language: English
Pages (from-to): 180-194
Number of pages: 15
Journal: Journal of Optical Communications and Networking
Volume: 18
Issue number: 3
DOIs
Publication status: Published - Mar 2026
Externally published: Yes
