TY - JOUR
T1 - Causality-aware Graph Mixture of Experts for Accurate Multi-cloud Workload Prediction
AU - Luo, Yongcan
AU - Yu, Zhihao
AU - Zheng, Jiahao
AU - Yang, Zhengjie
AU - Sun, Lei
AU - Wu, Dapeng
N1 - Publisher Copyright: © 2026 IEEE. All rights reserved.
PY - 2026
Y1 - 2026
N2 - Multi-cloud deployments are becoming increasingly necessary to meet data locality and compliance requirements, reduce latency for AI services, and mitigate the risk of single-provider failure. However, multi-cloud telemetry forms high-dimensional, heterogeneous, and nonstationary time series in which CPU, memory, disk, and LAN/WAN I/O exhibit time-varying, directional lead–lag effects across VMs and regions. Previous approaches, from RNNs to recent attention/GNN models, either assume linear stationarity or learn correlation-driven, largely symmetric dependencies that blur directed causal influence. To better exploit temporal dependencies in multi-cloud deployments, we propose CAGMoE, a causality-aware dual-router Graph Mixture-of-Experts for multi-cloud workload forecasting. First, CAGMoE constructs two complementary graphs per window: an inter-metric graph derived from a transfer-entropy proxy of Granger causality to encode directed cross-metric influence, and an intra-temporal graph to capture local temporal continuity. Next, a shared graph encoder produces token states and path summaries that drive dual routers to form Top-K sparse mixtures over experts. Finally, each expert is a FiLM-conditioned feed-forward network that injects a window-level causal vector to generate the final prediction. Experiments on the real-world multi-cloud dataset MUCEP and on the Google Cluster and Ali traces show that our method compares favorably with previous baselines, improving both accuracy and reliability and demonstrating potential for industrial use.
AB - Multi-cloud deployments are becoming increasingly necessary to meet data locality and compliance requirements, reduce latency for AI services, and mitigate the risk of single-provider failure. However, multi-cloud telemetry forms high-dimensional, heterogeneous, and nonstationary time series in which CPU, memory, disk, and LAN/WAN I/O exhibit time-varying, directional lead–lag effects across VMs and regions. Previous approaches, from RNNs to recent attention/GNN models, either assume linear stationarity or learn correlation-driven, largely symmetric dependencies that blur directed causal influence. To better exploit temporal dependencies in multi-cloud deployments, we propose CAGMoE, a causality-aware dual-router Graph Mixture-of-Experts for multi-cloud workload forecasting. First, CAGMoE constructs two complementary graphs per window: an inter-metric graph derived from a transfer-entropy proxy of Granger causality to encode directed cross-metric influence, and an intra-temporal graph to capture local temporal continuity. Next, a shared graph encoder produces token states and path summaries that drive dual routers to form Top-K sparse mixtures over experts. Finally, each expert is a FiLM-conditioned feed-forward network that injects a window-level causal vector to generate the final prediction. Experiments on the real-world multi-cloud dataset MUCEP and on the Google Cluster and Ali traces show that our method compares favorably with previous baselines, improving both accuracy and reliability and demonstrating potential for industrial use.
KW - Cloud computing
KW - graph neural network
KW - mixture of experts
KW - time series
UR - https://www.scopus.com/pages/publications/105038671658
U2 - 10.1109/TCC.2026.3689897
DO - 10.1109/TCC.2026.3689897
M3 - Article
AN - SCOPUS:105038671658
SN - 2168-7161
JO - IEEE Transactions on Cloud Computing
JF - IEEE Transactions on Cloud Computing
ER -