TY - JOUR
T1 - Discrete Soft Actor-Critic Algorithm With Heuristic-Based Action Mapping for RMSCA in MCF-EONs
AU - Zhang, Xiao
AU - Tian, Qinghua
AU - Xin, Xiangjun
AU - Pan, Yiqun
AU - Yao, Haipeng
AU - Gao, Ran
AU - Zhang, Qi
N1 - Publisher Copyright: © 1983-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - This paper proposes a novel deep reinforcement learning (DRL) architecture that enables effective decoupling between the agent and the environment for routing, modulation, spectrum, and core allocation (RMSCA) in multi-core fiber elastic optical networks (MCF-EONs). In the architecture, a heuristic-based action mapping layer (HAM) is designed between the agent and the environment. This layer maps the diverse action spaces of MCF-EONs into a unified and efficient space, providing the agent a stable and consistent interface. The HAM employs heuristic rules to filter and rank all possible decision options, ultimately selecting the top H high-quality candidate solutions for the agent to make decisions. Meanwhile, a general linear regression (LR) method is introduced to dynamically compute an optimal action space size H tailored to the specific scenario, improving the system’s flexibility and robustness across varying conditions. Finally, a reward function combining spectrum fragmentation and link load is designed to guide the agent in efficiently considering the state of spatial resource utilization. The proposed algorithm is evaluated under two different network topologies, various multi-core fibers, and traffic load conditions. The results show that, compared with advanced heuristic algorithms and DRL approaches, the proposed method reduces blocking probabilities by up to 89% and 83%, respectively, and demonstrates excellent generalization performance.
AB - This paper proposes a novel deep reinforcement learning (DRL) architecture that enables effective decoupling between the agent and the environment for routing, modulation, spectrum, and core allocation (RMSCA) in multi-core fiber elastic optical networks (MCF-EONs). In the architecture, a heuristic-based action mapping layer (HAM) is designed between the agent and the environment. This layer maps the diverse action spaces of MCF-EONs into a unified and efficient space, providing the agent a stable and consistent interface. The HAM employs heuristic rules to filter and rank all possible decision options, ultimately selecting the top H high-quality candidate solutions for the agent to make decisions. Meanwhile, a general linear regression (LR) method is introduced to dynamically compute an optimal action space size H tailored to the specific scenario, improving the system’s flexibility and robustness across varying conditions. Finally, a reward function combining spectrum fragmentation and link load is designed to guide the agent in efficiently considering the state of spatial resource utilization. The proposed algorithm is evaluated under two different network topologies, various multi-core fibers, and traffic load conditions. The results show that, compared with advanced heuristic algorithms and DRL approaches, the proposed method reduces blocking probabilities by up to 89% and 83%, respectively, and demonstrates excellent generalization performance.
KW - Action mapping
KW - deep reinforcement learning
KW - elastic optical networks
KW - multi core fibers
KW - resource allocation problem
UR - https://www.scopus.com/pages/publications/105017642166
U2 - 10.1109/JLT.2025.3617511
DO - 10.1109/JLT.2025.3617511
M3 - Article
AN - SCOPUS:105017642166
SN - 0733-8724
VL - 43
SP - 10849
EP - 10862
JO - Journal of Lightwave Technology
JF - Journal of Lightwave Technology
IS - 24
ER -